Preface


The scope, accessibility, impact, and relevance of biophysics to many different
areas of inquiry have grown tremendously in recent years, driven by major conceptual
advances, innovations in instrumentation, and powerful new analytical, computational,
genomic, and imaging methods. These advances enable us to visualize the
structures and dynamics of proteins and nucleic acids at atomic resolution, observe
single molecules in real time, monitor processes within living cells at high resolution,
identify and determine the chemical structures of individual molecules in complex
biological samples, and define the complex network of interactions that regulate
cellular function. Biophysical insights into plant, animal, and human inheritance,
development, physiology, health, and disease have driven and will continue to drive
many of the major advances in our quality of life. As a result, many people who are
not professional biophysicists increasingly have a need to learn about biophysics.

This volume is designed to enable students, scientists trained in other disciplines,
clinicians, members of the chemical, pharmaceutical, and biotech industries, intellectual
property professionals, and the general public to acquire an understanding of
molecular biophysics. In contrast to other areas of biophysics, molecular biophysics
focuses on the complex and beautiful macromolecules that encode all genetic information
and form the molecular machinery that drives all cellular processes. This
volume provides an overview of the development and scope of molecular biophysics
and in-depth discussions of the major experimental methods that enable biological
macromolecules to be studied at atomic resolution. It also reviews the
physical-chemical concepts that are needed to interpret the experimental results and
to understand how the structure, dynamics, and physical properties of biological
macromolecules enable them to perform their biological functions. Reviews of
research on three disparate biomolecular machines (DNA helicases, ATP synthases,
and myosin) illustrate how the combination of theory and experiment leads
to new insights and new questions.

This volume is a foundational volume in a series entitled Biophysics for the Life
Sciences, which is designed to introduce nonspecialists to major areas of biophysics
and to enable established investigators to learn about areas that lie outside their
primary interests. The scope of the series will be broad, encompassing molecular,
cellular, and organismal biophysics. The series is intended to be synthetic and to
emphasize concepts, rather than to be comprehensive or didactic. Each volume will
consist of up-to-date presentations of research topics and research approaches, including
observation, experiment, and computation. When appropriate, the relationship
of basic research to translational research, clinical issues, and technology transfer
will be considered.

The  editors  would  like  to  thank  all  the  people  who  contributed  to  this  volume, 
including  their  colleagues  and  students,  and  the  following  topic  area  experts:  Gerald 
Becker,  Clive  Bagshaw,  Christine  Cremo,  Stanley  Dunn,  Walter  Englander,  R. 
Matthew  Fesinmeyer,  Masa  Futai,  Michael  Geeves,  Wouter  Hoff,  Andreas 
Holzenburg,  Tatyana  Igumenova,  Yijia  Jiang,  Miklos  Kellermayer,  Gerd  Kleeman, 
Ramil  Latypov,  Juliette  Lecomte,  Russell  Middaugh,  Marcia  Newcomer,  Joseph 
Phillips,  Leszek  Poppe,  Kevin  Raney,  Vladimir  Razinkov,  George  Reed,  Gina 
Sosinsky, Stefan Stoll, Juraj Svitel, John Tesmer, Michael Trakselis, Art van der Est,
Stanislav  Vitha,  Joseph  Wedekind,  and  Julian  Whitelegge. 


Norma M. Allewell, College Park, MD, USA
Linda O. Narhi, Thousand Oaks, CA, USA
Ivan Rayment, Madison, WI, USA


About  the  Editors 


Dr.  Norma  M.  Allewell  is  Professor  of  Cell  Biology  and  Molecular  Genetics,  and 
Affiliate  Professor  of  Chemistry  and  Biochemistry  at  the  University  of  Maryland, 
where  she  served  as  Interim  Vice  President  for  Research,  and  Dean  of  the  College 
of  Chemical  and  Life  Sciences  for  a  decade.  She  also  held  faculty  positions  at  the 
University  of  Minnesota,  where  she  was  a  department  head  and  vice  provost; 
Wesleyan  University,  where  she  was  founding  chair  of  the  Department  of  Molecular 
Biology  and  Biochemistry;  and  the  Polytechnic  Institute  of  Brooklyn.  Dr.  Allewell 
holds a B.Sc. (Hon.) from McMaster University and a Ph.D. in molecular biophysics
from Yale University. Her research focuses on protein structure, function, and
dynamics, and metabolic regulatory mechanisms and diseases. She has published
approximately 150 peer-reviewed papers, edited two books, and contributed several
book chapters. She is a Past President of the Biophysical Society, a former US
representative to the International Union of Pure and Applied Biophysics, and a Fellow
of the American Association for the Advancement of Science. She is an Associate
Editor of the Journal of Biological Chemistry and a former Editorial Board Member
for Biopolymers. She served on the National Academy of Sciences Space Studies
Board Committee on Space Biology and Medicine; the Board of Scientific Advisors
for the National Center for Biotechnology Information; and the Advisory Committee,
Directorate of Biological Sciences, National Science Foundation, as well as numerous
review panels for the National Institutes of Health, National Science Foundation,
and Howard Hughes Medical Institute. She was a Jefferson Science Fellow at the
US State Department in the Bureau of East Asia and the Pacific in 2011-2012.

Dr.  Linda  O.  Narhi  received  her  B.Sc.  in  Chemistry  (Honors)  from  the  University 
of  Michigan  and  her  Ph.D.  in  Biological  Chemistry  from  UCLA.  She  joined  Amgen 
26  years  ago  and  has  been  a  member  of  the  R&D,  Quality,  and  Operations  groups. 
She  is  currently  a  Scientific  Executive  Director  in  the  Product  Attribute  Science 
group  in  R&D,  where  her  responsibilities  include  solution  stability  assessment  of 
all protein-based therapeutic candidates and developing and implementing predictive
assays for protein stability to process, storage and delivery conditions. She is also
responsible for developing and implementing assays to assess biological consequences
of product quality attributes, especially protein aggregates. She co-leads
the cross-functional Immunogenicity Team, developing, adapting, and implementing
methods to assess relative immunogenic potential of protein aggregates and
other quality attributes, and also co-leads the cross-functional antibody engineering
team. She is the leader of the Technology Forum Leadership Team, coordinating
and leading the technology development efforts in Process and Product Development,
a department of about 600 scientists in R&D. She is the editor for Galenics,
the internal Process and Product Development journal for original research. She is
a member of the US Pharmacological Convention expert committee on subvisible
particle analysis for Biologics, on the steering committee for the American
Association of Pharmaceutical Chemists Focus group on protein aggregates and
Biological Consequences, and for the next several years will co-chair the annual
meeting on protein higher order structure sponsored by CASSS, an international
separation science society. She has published well over 100 articles in peer-reviewed
journals and authored numerous book chapters on the subjects of protein folding,
stability, aggregation, and biotherapeutic development.

Dr. Ivan Rayment is Professor of Biochemistry at the University of Wisconsin-Madison,
where he holds the Michael G. Rossmann Professorship in Biochemistry.
Dr. Rayment received his B.Sc. and Ph.D. in Chemistry from Durham University,
England. He has a wide range of interests in structural biology and has made seminal
contributions to our understanding of the structural basis of motility, enzyme
evolution, cobalamin biosynthesis, and transposition. He has published over 180
peer-reviewed papers. He is a Fellow of the American Association for the
Advancement of Science and a Fellow of the Biophysical Society.


Contents


1 Introduction
Norma M. Allewell, Linda O. Narhi, and Ivan Rayment

2 Structural, Physical, and Chemical Principles
Norma M. Allewell, Linda O. Narhi, and Ivan Rayment

Part I The Experimental Tools of Molecular Biophysics

3 Optical Spectroscopic Methods for the Analysis of Biological Macromolecules
Linda O. Narhi, Cynthia H. Li, Ranjini Ramachander, Juraj Svitel, and Yijia Jiang

4 Diffraction and Scattering by X-Rays and Neutrons
Ivan Rayment

5 Nuclear Magnetic Resonance Spectroscopy
Thomas C. Pochapsky and Susan Sondej Pochapsky

6 Electron Paramagnetic Resonance Spectroscopy
John H. Golbeck and Art van der Est

7 Mass Spectrometry
Igor A. Kaltashov and Cedric E. Bobst

8 Single-Molecule Methods
Paul J. Bujalowski, Michael Sherman, and Andres F. Oberhauser

Part II Biological Macromolecules as Molecular Machines: Three Examples

9 Helicase Unwinding at the Replication Fork
Divya Nandakumar and Smita S. Patel

10 Rotary Motor ATPases
Stephan Wilkens

11 Biophysical Approaches to Understanding the Action of Myosin as a Molecular Machine
Mihaly Kovacs and Andras Malnasi-Csizmadia

Part III Future Prospects

12 Future Prospects
Norma M. Allewell, Igor A. Kaltashov, Linda O. Narhi, and Ivan Rayment

Index


Contributors 


Cedric  E.  Bobst  Department  of  Chemistry,  University  of  Massachusetts-Amherst, 
Amherst,  MA,  USA 

Paul  J.  Bujalowski  Department  of  Biochemistry  and  Molecular  Biology, 
University  of  Texas  Medical  Branch  at  Galveston,  Galveston,  TX,  USA 

John  H.  Golbeck  Department  of  Biochemistry  and  Molecular  Biology,  The 
Pennsylvania  State  University,  University  Park,  PA,  USA 

Department  of  Chemistry,  The  Pennsylvania  State  University,  University  Park, 
PA,  USA 

Yijia  Jiang  Research  and  Development,  Amgen  Inc.,  Thousand  Oaks,  CA,  USA 

Igor  A.  Kaltashov  Department  of  Chemistry,  University  of  Massachusetts- 
Amherst,  Amherst,  MA,  USA 

Mihaly  Kovacs  Department  of  Biochemistry,  Eotvos  Lorand  University-Hungarian 
Academy  of  Sciences  "Momentum"  Motor  Enzymology  Research  Group,  Eotvos 
Lorand  University,  Budapest,  Hungary 

Cynthia H. Li Research and Development, Amgen Inc., Thousand Oaks, CA,
USA

Andras  Malnasi-Csizmadia  Department  of  Biochemistry,  Eotvos  Lorand 
University-Hungarian  Academy  of  Sciences  Molecular  Biophysics  Research  Group, 
Eotvos  Lorand  University,  Budapest,  Hungary 

Divya  Nandakumar  Department  of  Biochemistry  and  Molecular  Biology,  Robert 
Wood  Johnson  Medical  School,  Rutgers  University,  Piscataway,  NJ,  USA 


Andres  F.  Oberhauser  Department  of  Biochemistry  and  Molecular  Biology, 
University  of  Texas  Medical  Branch  at  Galveston,  Galveston,  TX,  USA 

Department  of  Neuroscience  and  Cell  Biology,  University  of  Texas  Medical  Branch 
at  Galveston,  Galveston,  TX,  USA 

Sealy  Center  for  Structural  Biology  and  Molecular  Biophysics,  University  of  Texas 
Medical  Branch  at  Galveston,  Galveston,  TX,  USA 

Smita  S.  Patel  Department  of  Biochemistry  and  Molecular  Biology,  Robert  Wood 
Johnson  Medical  School,  Rutgers  University,  Piscataway,  NJ,  USA 

Susan  Sondej  Pochapsky  Department  of  Chemistry  and  Rosenstiel  Basic  Medical 
Sciences  Research  Institute,  Brandeis  University,  Waltham,  MA,  USA 

Thomas  C.  Pochapsky  Department  of  Chemistry  and  Rosenstiel  Basic  Medical 
Sciences  Research  Institute,  Brandeis  University,  Waltham,  MA,  USA 

Ranjini  Ramachander  Research  and  Development,  Amgen  Inc.,  Thousand  Oaks, 
CA,  USA 

Michael  Sherman  Department  of  Biochemistry  and  Molecular  Biology,  University 
of  Texas  Medical  Branch  at  Galveston,  Galveston,  TX,  USA 

Juraj  Svitel  Research  and  Development,  Amgen  Inc.,  Thousand  Oaks,  CA,  USA 

Art  van  der  Est  Department  of  Chemistry,  Brock  University,  St.  Catharines, 
ON,  Canada 

Stephan  Wilkens  Department  of  Biochemistry  and  Molecular  Biology,  SUNY 
Upstate  Medical  University,  Syracuse,  NY,  USA 


Chapter  1 

Introduction 


Norma  M.  Allewell,  Linda  O.  Narhi,  and  Ivan  Rayment 


Abstract This introductory chapter reviews the history and development of
biophysics and provides an overview of topics covered in the volume. It introduces
the major experimental methods available for studying biological macromolecules
(optical spectroscopy, X-ray and neutron diffraction and scattering, electron paramagnetic
resonance spectroscopy, nuclear magnetic resonance spectroscopy, mass
spectrometry, and single molecule methods). The role of biological macromolecules
as molecular machines is discussed, using as examples the three macromolecular
machines presented in subsequent chapters: helicases, membrane ATPases, and
myosin. The chapter concludes with brief discussions of the role of computation
and future prospects in three areas: X-ray and neutron diffraction and scattering,
mass spectrometry, and drug and pharmaceutical development.

Keywords Biomolecular machines • Diffraction • Electron paramagnetic resonance
spectroscopy • Experimental methods • Future prospects • Historical overview
• Mass spectrometry • Molecular machines • Optical spectroscopy • Nuclear
magnetic resonance spectroscopy • Single molecule methods


1.1 What Is Biophysics?

Biophysics seeks to define the physical properties of living systems and to understand
the relationship between the physical properties of biological molecules, cells, tissues,
and organs and their biological function. The science of biophysics provides
both a conceptual framework and powerful experimental and theoretical tools.
Biophysicists use these approaches to work towards an understanding of the physical
basis of life at every level, from the physiological and neurological to the cellular
and molecular. Examples of the kinds of questions biophysicists ask include
the following:

•  How  do  muscles  contract? 

•  How  do  nerves  send  signals  to  the  brain? 

•  How  do  we  see  and  hear? 

•  How  do  we  taste  and  smell  chemicals? 

•  How  do  plant  cells  use  light  to  convert  carbon  dioxide  and  water  to  nutrients? 

•  How  do  protozoa  and  sperm  swim? 

•  How  do  enzymes  recognize  their  substrates  and  catalyze  their  conversion  to  products? 

• How do the three-dimensional structures of macromolecules enable them to perform
their function?

• What physical forces determine the three-dimensional structures of macromolecules
and their assembly to form larger structures, and how does assembly affect
function?

Completely answering these questions requires integrating knowledge obtained
by scientists from many different disciplines: biochemists, chemists, cell biologists,
geneticists, microbiologists, neurobiologists, physiologists, plant biologists,
and others. Thus biophysics is inherently interdisciplinary and cross-disciplinary.
The experimental tools of physics and chemistry enable visualization of both the
structures and dynamics of organs, tissues, cells, and molecules and allow the
molecular processes and chemical reactions that underlie biological function to be
explored and defined. The conceptual and computational tools of mathematics and
computer science are required all along the way: first, to translate the raw experimental
data into meaningful information about the process under investigation, and
then to develop, refine, and test models of how the process might occur. Usually the
initial models are descriptive and qualitative, but as more and better information is
obtained, the models can be made increasingly quantitative, enabling them to be
tested more rigorously. Importantly, quantitative models incorporate experimentally
derived knowledge about the process that generally leads to hypotheses that extend
beyond the scope of the original question.

Over  the  past  several  decades,  groundbreaking  advances  have  increased  the  scope, 
impact,  and  relevance  of  biophysics  tremendously.  Fundamental  biophysical  questions 
are  being  addressed  at  every  level  of  biology.  Experimental  biophysics  continues  to 
generate  new  challenges  and  opportunities  in  the  physical  sciences,  mathematics,  and 
computation,  and,  in  turn,  new  developments  in  these  fields  create  new  opportunities  in 
experimental biophysics. There are also important interfaces between biophysics and
other rapidly developing interdisciplinary fields, such as bioengineering, genomics,
and  bioinformatics.  Finally,  biophysics  has  much  to  contribute  to  biotechnology,  drug 
discovery,  and  clinical  medicine,  as  well  as  many  other  fields. 


1.2    How  Did  Biophysics  Develop? 

The origins of biophysics go back at least as far as Leonardo da Vinci's anatomical
and biomechanical studies of the human skeleton, muscles, and heart in the sixteenth
century (Fig. 1.1); William Harvey's analysis of blood circulation in the
mid-seventeenth century; and Antonie van Leeuwenhoek's observations of cell structure
and function in the late seventeenth century (Fig. 1.2). From that time forward,
microscopists, anatomists, physiologists, physicians, and others who worked to
understand how cells, tissues, and organs function pursued studies that often had a
substantial biophysical component. In the mid-nineteenth century, Claude Bernard's
investigations of organ function and the mechanisms that enable organisms to maintain
a steady state laid the foundations for systems physiology. In the mid-twentieth
century, Alan Hodgkin and Andrew Huxley's mathematical reconstruction of the
nerve impulse initiated the development of cellular biophysics.

Fig. 1.1 Leonardo da Vinci's sketch of the heart showing the distribution of the blood vessels.
Image provided to Springer Images by BioMed Central

Fig. 1.2 From a portrait by Van Verkolje of Antonie van Leeuwenhoek, a Dutch tradesman who lived
in the seventeenth century and developed and built the first microscopes that enabled microorganisms
and plant and animal cells to be viewed (Reproduced with permission of the Rijksmuseum)

Biophysics did not emerge as a recognized scientific field until the middle of the
twentieth century, when a visionary group of atomic physicists turned their attention
to the major unsolved problems in biology. Erwin Schrödinger, one of the founders
of quantum mechanics, was a member of this group and in 1944 published a series
of lectures entitled "What is Life?" that proved to be transformational. In these lectures,
Schrödinger defined the central question for physicists (and chemists) entering
the new field of biophysics as follows: "How can the events in space and time
which take place within the spatial boundaries of a living organism be accounted for
by physics and chemistry?" He then went on to argue that "The obvious inability of
present day physics and chemistry to account for such events is no reason at all for
doubting that they can be accounted for by those sciences."

Soon after the publication of "What is Life?" molecular biophysics emerged as a
new scientific discipline with the discovery that the three-dimensional structures of
biological macromolecules could be determined by X-ray diffraction. This discovery
enabled the hard-won gains of several generations of biochemists who had
worked to define the chemical structures and biological functions of biological molecules
to be taken to a new level. The success of Dorothy Crowfoot Hodgkin
(Fig. 1.3) in determining the three-dimensional structures of cholesterol, penicillin,
and vitamin B12 was a groundbreaking example. Building on the experimental
work of Maurice Wilkins and Rosalind Franklin, James Watson and Francis Crick
(Fig. 1.4) proposed in 1953 that DNA had a double helical structure that enabled it
to encode genetic information. A full history of the science that paved the way for
this discovery can be found in [1]. At the same time, Max Perutz began to develop
experimental and computational methods for visualizing the three-dimensional
structures of proteins at atomic resolution (Fig. 1.5). By 1959, Perutz was able to
report the three-dimensional structure of hemoglobin, the major carrier for oxygen
in the blood (Fig. 1.6), and shortly thereafter, John Kendrew and colleagues determined
the structure of myoglobin, which sequesters oxygen in tissues. Since that
time, more than 80,000 protein structures have been reported. The relationship
between the structures of proteins and nucleic acids and their function is discussed
in detail in [2-5].

Fig. 1.3 Dorothy Crowfoot Hodgkin was awarded the Nobel Prize in Chemistry in 1964 for her
discoveries, through the use of X-ray diffraction, of the structures of biologically important molecules,
including penicillin, vitamin B12 and the protein hormone insulin (1969). Her achievements
included not only these structure determinations and the scientific insight they provided but
also the development of methods that made such structure determinations possible (Reproduced
with permission of the Royal Society)

Fig. 1.4 James Watson and Francis Crick shown with their model of the DNA double helix. Together
with Maurice Wilkins, they received the Nobel Prize in 1962 for "their discoveries concerning the
molecular structure of nucleic acids and its significance for information transfer in living material."
Research on the crystal structure of DNA by Rosalind Franklin in Maurice Wilkins' laboratory was
critical to the development of this model (Reproduced with permission of Photo Researchers, Inc.)

Fig. 1.5 Linus Pauling, Max Delbrück and Max Perutz in 1962. All three won Nobel Prizes: Linus
Pauling for his research into the nature of the chemical bond and its application to the elucidation
of the structure of complex substances; Max Delbrück, with Alfred Hershey and Salvador Luria,
for their discoveries concerning the replication mechanism and the genetic structure of viruses; and
Max Perutz, with John Kendrew, for their studies of the structures of globular proteins (Reproduced
with permission of Oregon State University Libraries)

Fig. 1.6 The fold of the polypeptide backbone of the four subunits in human deoxyhemoglobin at
1.74 Å resolution, as reported by Fermi and Perutz. The α-chains are colored in white and the
β-chains are depicted in green. The four heme groups that bind oxygen are shown in black, with
the central iron ion in red. Nitrogen atoms of histidine side chains that interact with the iron ion are
shown in blue (PDB accession number 4HHP)

Schrödinger and his fellow physicists recognized that understanding how genetic
information was stored and transmitted from one generation to the next is as important
as understanding the structure and physical properties of macromolecules.
Another member of this group, Max Delbrück (Fig. 1.5), played a critical role by
arguing successfully that bacterial viruses provide the best opportunity to understand
the flow of genetic information in living systems, because they are among the
simplest organisms. The work of Delbrück, his colleagues, and their successors laid
the foundations for a second new scientific discipline, molecular biology, based on
the successful elucidation of the genetic code. Over the next six decades, the interaction
and weaving together of molecular biology and molecular biophysics have led
to a rich succession of dazzling scientific discoveries, many of which have been
honored with Nobel Prizes [6-12].


1.3    The  Experimental  Tools  of  Molecular  Biophysics 

While all the molecules of the cell play a role in its function, molecular biophysics
is particularly focused on macromolecules, large molecules with molecular weights
in the tens to hundreds of thousands of daltons. Macromolecules consist of small
monomeric units linked together to form large polymers. Often single polymers
assemble to form larger multimeric units, called macromolecular assemblies. The
best known and most intensively studied biological macromolecules are nucleic
acids and proteins, both of which are critically important to the functioning of living
systems. Nucleic acids encode the genetic information of the cell and act as templates
for the synthesis of proteins, while proteins are the molecular machines that
enable most functions of the cell to be carried out.

As is the case throughout science, advances in biophysics are built on observation
and measurement. Because molecules and cells are much too small to be seen
with the naked eye, molecular and cellular biophysicists depend upon various
kinds of scientific instrumentation that gather different kinds of information. The
physical properties of molecules that molecular biophysicists are most interested
in are their three-dimensional structures and dynamics and their energetic, electrical,
magnetic, and mechanical properties. This volume describes the major experimental
approaches that are currently used to probe and analyze the structural,
dynamic, and physical properties of biological macromolecules at atomic resolution.
The next section presents a brief survey of these methods, with many interesting
and important features of each approach deferred for discussion in
subsequent chapters. The last three chapters of the volume present three examples
of how experimental results obtained by a variety of approaches can be integrated
and used to develop a comprehensive understanding of how proteins function as
molecular machines.


1.3.1 Absorption, Circular Dichroic, and Fluorescent Spectroscopy

Several forms of absorption spectroscopy, discussed in Chap. 3, monitor the absorption
of ultraviolet, visible, and infrared light by biological molecules. The intrinsic
chromophores of DNA are the nucleotide bases, while in proteins both the peptide
backbone and the aromatic amino acids have specific absorbances that can be exploited
to provide information about the structure of the molecules. Most forms of absorption
spectroscopy do not have the resolution required to acquire highly detailed structural
information. They are, however, invaluable in characterizing larger elements of
molecular structure and monitoring structural changes and chemical processes. In
addition, attachment of specific chemical groups that have unique spectral properties
(external chromophores) may allow higher resolution information to be obtained.

Circular dichroic spectroscopy measures the difference in absorption of left- and
right-handed circularly polarized light and is often used to probe the overall structure
of chiral macromolecules. In proteins, different types of secondary structure
have specific signals in the far ultraviolet region, while the aromatic amino acids
and disulfide bonds have signals in the near ultraviolet region that can be used to
follow changes in tertiary structure.

Fluorescence  spectroscopy  monitors  the  emission  of  light  from  chemical  groups 
that  are  also  able  to  absorb  light  of  a  shorter  wavelength.  Fluorescence  spectroscopy 
has  more  specificity  and  resolution  than  absorption  spectroscopy,  but  decreased  sig- 
nal strength  relative  to  absorption  spectroscopy  because  fewer  chemical  groups  are 
fluorescent.  The  specificity  of  fluorescence  spectroscopy  has  made  it  possible  to 
develop  molecular  rulers  that  measure  the  distance  between  two  different  fluores- 
cent groups  attached  to  a  biological  macromolecule.  When  the  two  groups  are 
appropriately  matched,  fluorescent  energy  will  be  transferred  between  them.  The 
efficiency  of  energy  transfer  depends  upon  the  distance  between  the  two  molecules, 
and  so  measuring  the  efficiency  of  transfer  enables  the  distance  to  be  determined  at 
high  resolution.  Changes  in  the  efficiency  of  energy  transfer  with  time  also  allow 
fluctuations  in  molecular  structure  to  be  monitored  and  analyzed. 
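
The distance dependence behind such a molecular ruler is usually described by the Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the separation at which transfer is 50% efficient. The short Python sketch below is illustrative only: the function names and the value R0 = 5 nm are assumptions chosen for the example, not values taken from this chapter.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Transfer efficiency from the Forster relation E = 1 / (1 + (r/R0)^6)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency: float, r0_nm: float) -> float:
    """Invert the Forster relation to recover the donor-acceptor distance."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # assumed Forster radius for a hypothetical dye pair

if __name__ == "__main__":
    for r in (2.5, 5.0, 7.5):
        e = fret_efficiency(r, R0_NM)
        print(f"r = {r:.1f} nm -> E = {e:.3f}")
```

Because of the sixth-power dependence, the efficiency falls from about 0.98 to about 0.08 as r increases from half of R0 to 1.5 times R0, which is why the ruler is most sensitive for distances close to R0.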

In  common  with  most  spectroscopic  methods,  optical  spectroscopy  provides  the 
average  for  the  total  population,  but  cannot  differentiate  between  subpopulations. 
Thus  large  changes  in  the  magnitude  of  the  signal  could  stem  from  small  uniform 
changes  across  the  entire  molecular  population  in  the  test  samples,  or  could  result 
from  large  changes  in  structure  that  occur  in  only  a  fraction  of  the  population. 
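
The ambiguity of an ensemble average can be made concrete with a small numerical sketch. The populations and signal values below are invented for illustration: in one case every molecule loses 10% of its signal, and in the other 10% of the molecules lose all of their signal. The ensemble averages are identical, and only the per-molecule spread, which an ensemble measurement does not report, tells the two cases apart.

```python
from statistics import pstdev


def ensemble_signal(per_molecule_signals):
    """Mean signal, which is what a bulk spectroscopic measurement reports."""
    return sum(per_molecule_signals) / len(per_molecule_signals)


N = 1000  # hypothetical number of molecules in the sample

# Case 1: a small, uniform change in every molecule.
uniform_change = [0.9] * N

# Case 2: a large change confined to 10% of the molecules.
subpopulation_change = [0.0] * (N // 10) + [1.0] * (N - N // 10)

if __name__ == "__main__":
    for label, signals in (("uniform", uniform_change),
                           ("subpopulation", subpopulation_change)):
        print(label, ensemble_signal(signals), pstdev(signals))
```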


1.3.2   Atomic-Level  Structural  Methods 

Only  two  experimental  approaches,  both  developed  by  physicists,  enable  the  struc- 
tures of  macromolecules  to  be  determined  at  atomic  resolution.  X-ray  crystallog- 
raphy, discussed  in  Chap.  4,  depends  upon  the  diffraction  of  X-rays  by  crystals  of 
the molecule being studied. Since the structures determined by X-ray crystallography
are derived from molecules in a crystal, the information that they provide about the
flexibility  and  dynamics  of  the  molecule  may  not  reflect  the  full  flexibility  and 
dynamics  of  the  molecule  in  solution. 

The  second  approach,  nuclear  magnetic  resonance  spectroscopy — often  abbrevi- 
ated NMR — is  discussed  in  Chap.  5.  NMR  exploits  the  fact  that  many  of  the  atoms 
in biological molecules have nuclear spins that align with an externally applied magnetic
field. When a radio frequency pulse is applied, the alignment changes, and the
change  in  alignment  can  be  measured.  The  response  of  individual  nuclei  to  the  radio 
frequency  pulse  depends  upon  the  location  of  other  nuclei.  Mathematical  analysis  of 
the  response  of  the  nuclear  spins  of  different  kinds  of  atoms  in  the  molecule  to  vari- 
ous pulses  enables  the  three-dimensional  structure  of  the  molecule  to  be  deduced. 
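
As a small worked illustration of why radio frequency pulses are used, the resonance (Larmor) frequency of a nucleus is proportional to the strength of the static magnetic field. The sketch below uses the gyromagnetic ratio of the hydrogen nucleus divided by 2π, approximately 42.577 MHz per tesla; the function name and the field strengths are assumptions chosen for the example.

```python
GAMMA_BAR_1H_MHZ_PER_T = 42.577  # 1H gyromagnetic ratio / (2*pi), in MHz per tesla


def larmor_frequency_mhz(b0_tesla: float,
                         gamma_bar_mhz_per_t: float = GAMMA_BAR_1H_MHZ_PER_T) -> float:
    """Resonance frequency (MHz) of a nucleus in a static field of b0_tesla."""
    if b0_tesla < 0:
        raise ValueError("field strength must be non-negative")
    return gamma_bar_mhz_per_t * b0_tesla


if __name__ == "__main__":
    for b0 in (9.4, 14.1, 18.8):  # illustrative magnet strengths in tesla
        print(f"B0 = {b0:4.1f} T -> 1H resonance near {larmor_frequency_mhz(b0):.0f} MHz")
```

For fields of this size the hydrogen resonance falls at several hundred megahertz, which lies in the radio frequency range.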

In  contrast  to  X-ray  crystallography,  most  structures  determined  by  NMR  are 
derived  from  molecules  in  solution  and  so  provide  more  information  about  the  dynam- 
ical properties  of  the  molecule.  However,  there  is  an  upper  limit  on  the  molecular 
weight  of  molecules  that  can  be  studied  by  NMR  (currently  in  the  range  of  60,000  Da), 
whereas  in  principle  there  is  no  limit  on  the  molecular  weights  of  molecules  that  can 
be  studied  by  X-ray  crystallography,  provided  they  can  be  crystallized. 

Although only X-ray crystallography and NMR allow the structure of the entire molecule
to be defined at atomic resolution, several other approaches provide atomic
resolution  information  about  specific  elements  of  the  structure.  Electron  paramag- 
netic resonance  spectroscopy — often  abbreviated  EPR — is  discussed  in  Chap.  6. 
EPR  spectroscopy  depends  upon  the  magnetic  dipole  moments  of  metal  ions  or  free 
radicals  that,  like  nuclear  spins,  align  with  magnetic  fields  and  change  their  align- 
ment when  a  microwave  pulse  is  applied.  EPR  is  often  used  to  obtain  atomic-level 
information about metal centers in macromolecules, many of which are proteins
(for example, chlorophyll-containing and iron-containing proteins).


1.3.3    Mass  Spectrometry 

Mass  spectrometry,  discussed  in  Chap.  7,  is  a  versatile  and  evolving  technology  that 
has  several  different  modes  and  many  different  applications  in  both  the  physical  and 
life  sciences.  In  all  mass  spectrometry  experiments,  molecules  are  broken  into 
charged  fragments,  either  by  collisions  with  other  molecules  or  by  fast  moving  elec- 
trons. The  fragments  are  then  separated  and  analyzed.  Mass  spectrometry  is  rou- 
tinely used  to  sequence  biopolymers  and  is  also  a  powerful  tool  for  analyzing 
interactions  between  molecules  and  structural  fluctuations  within  molecules. 


1.3.4    Single  Molecule  Methods 

As  powerful  as  they  are,  all  of  the  methods  discussed  above  have  a  major  limitation; 
they  monitor  only  the  average  structures  and  behaviors  of  an  enormous  ensemble  of 
molecules  whose  behavior  is  not  synchronized,  as  discussed  above.  As  a  result,  many 
details of molecular structure and behavior that are essential to function cannot be
explored.  However,  advances  in  recent  decades  have  led  to  the  development  of  five 
methods,  briefly  described  below,  for  studying  the  structure  and  behavior  of  single 
molecules.  These  single  molecule  methods  make  use  of  different  physical  phenomena, 
provide  different  kinds  of  information,  and  have  different  advantages  and  disadvan- 
tages. They  are  discussed  in  detail  in  Chap.  8  and  in  another  volume  in  this  series  [13]. 

Three  of  these  methods  provide  structural  information  that  cannot  be  obtained  by 
X-ray  crystallography  or  NMR.  Cryo-electron  microscopy  (electron  microscopy 
carried  out  at  very  low  temperatures)  is  often  used  to  provide  medium  resolution 
structures  of  very  large  molecular  assemblies  that  cannot  be  crystallized  and  studied 
by  X-ray  crystallography.  It  has  the  advantage  over  classical  electron  microscopy  in 
not  requiring  fixatives  that  may  distort  the  structure.  Atomic  Force  Microscopy 
(AFM)  uses  a  microprobe  to  map  the  surfaces  of  individual  molecules  deposited  on 
a solid support, for example DNA molecules on a glass plate. Total Internal
Reflection Fluorescence Microscopy (TIRFM), like AFM, is designed to study the structure of
surfaces.  However,  the  two  microscopes  are  based  on  different  physical  principles 
and  are  used  to  study  different  phenomena.  In  TIRFM  experiments,  the  sample  is 
deposited  at  a  glass-liquid  interface  and  the  fluorescence  generated  by  light  totally 
internally  reflected  from  the  glass-liquid  interface  is  monitored.  This  method  has 
been  particularly  useful  in  studying  the  structure  of  cell  membranes  in  living  cells. 

The  two  remaining  single  molecule  methods  provide  dynamic,  as  well  as  struc- 
tural information.  Molecular  tweezers  measure  the  response  of  a  molecule  to  an 
imposed  force,  either  optical  or  magnetic,  and  provide  information  about  molecular 
elasticity  and  internal  motions.  This  method  has  been  particularly  important  in 
studying  molecular  motors;  for  example,  muscle  and  cellular  actin  and  myosin  (dis- 
cussed in  Chap.  11).  Single  Molecule  Fluorescence  Resonance  Energy  Transfer 
(FRET)  makes  use  of  the  strategy  described  above  for  molecular  rulers — monitor- 
ing the  transfer  of  light  energy  absorbed  by  one  chromophore  at  a  given  wavelength 
to  a  second  chromophore  which  fluoresces  at  a  longer  wavelength.  However,  while 
molecular  rulers  were  originally  used  to  study  the  average  distance  between  the  two 
chromophores  in  an  ensemble  of  molecules,  FRET  can  now  be  carried  out  with 
single  molecules,  rather  than  an  ensemble,  enabling  the  dynamics  of  single  mole- 
cules to  be  studied  over  time. 


1.4    The  Role  of  Computation  in  Biophysics 

The  high  resolution  methods  of  molecular  biophysics  generate  huge  amounts  of  raw 
data  that  can  only  be  collected,  stored,  analyzed,  and  interpreted  with  the  aid  of  pow- 
erful computers  and  a  variety  of  sophisticated  mathematical  tools.  Advances  in  math- 
ematical theory  and  computing  power  have  driven  many  advances  in  molecular 
biophysics.  Examples  include  the  introduction  of  computer  graphics  to  model  molec- 
ular structure  in  the  1970s  and  the  use  of  molecular  dynamics  techniques  to  simulate 
intramolecular  motion  in  the  1980s.  The  need  for  computational  power  has  increased 
dramatically  with  the  advent  of  genomic  and  proteomic  approaches,  bioinformatics, 
and powerful new cell imaging methods. These advances have opened new frontiers
in  terms  of  understanding  how  macromolecules  function  in  a  cellular  context.  A  full 
treatment  of  computational  biophysics  is  beyond  the  scope  of  this  volume. 


1.5    Biological  Macromolecules  as  Molecular  Machines 

One  of  the  most  fundamental  and  important  insights  to  emerge  from  more  than  a 
half  century  of  biophysical  investigations  of  biological  macromolecules  is  that  they 
are  the  molecular  machines  that  perform  the  essential  functions  of  the  cell.  Like  the 
machines  developed  during  the  industrial  revolution,  they  consume  energy,  have 
moving  parts,  and  do  work.  Like  the  engines  and  motors  of  everyday  life,  the  levels 
of  their  activity  can  be  regulated  to  adjust  to  changes  in  external  conditions.  In  addi- 
tion, like  the  machines  of  the  twenty-first  century,  biomolecular  machines  store 
information  that  directs  and  modulates  their  function;  however,  these  molecular 
machines  have  dimensions  of  nanometers  rather  than  meters. 

Most  of  the  molecular  machines  of  the  cell  are  proteins  that  perform  a  multitude 
of  functions.  Some  are  motor  proteins  that  have  the  ability  to  move  along  fibers  and 
surfaces  within  the  cell.  For  example,  the  movement  of  myosin  molecules  along  an 
actin  fiber  drives  muscle  contraction  (Fig.  1.7).  Most  non-muscle  cells  also  contain 
other  forms  of  actin  and  myosin,  as  well  as  other  motor  proteins  that  are  responsible 
for  cell  motility,  the  mechanics  of  cell  division,  and  transport  of  materials  within 
cells;  for  example,  dynein  and  kinesin. 

Enzymes,  the  proteins  that  catalyze  chemical  reactions  in  cells,  are  another  class 
of  molecular  machines,  analogous  to  the  machines  that  convert  raw  materials  into 
products  in  industrial  factories.  Every  cell  has  hundreds  of  different  enzymes  that 
act on different substrates to convert them into products needed by the cell. Among
the best known and most widely studied are the enzymes of metabolism, which
break the molecules in food into smaller molecules, many of which are used to
synthesize molecules needed by the organism. Other enzymes are involved in cell
signaling pathways, cell death, and the synthesis of macromolecules such as DNA,
RNA, and proteins.

Fig. 1.7 Molecular model of subfragment-1 of chicken myosin docked onto actin
(PDB accession number 2MYS for myosin)

Both  the  plasma  membrane  that  encloses  the  cell  and  intracellular  membranes 
that  form  compartments  within  cells  have  membrane  proteins  imbedded  in  or 
attached  to  them.  These  membrane  proteins  perform  several  different  functions. 
Some  control  movement  of  ions  or  molecules  across  membranes.  Others  are  com- 
ponents of  signaling  pathways  that  transfer  signals  from  one  side  of  the  membrane 
to  the  other  and  modulate  the  action  of  signaling  pathways  within  cells.  Some  cell 
membrane  enzymes  are  part  of  the  immune  system,  and  act  to  neutralize  toxic  mol- 
ecules or  pathogens. 

While  proteins  frequently  function  autonomously  as  molecular  machines,  they 
also  assemble  with  other  types  of  macromolecules  to  form  more  complex  structures 
with  novel  properties.  One  of  the  best-known  examples  is  the  ribosome,  a  huge 
molecular  machine  that  plays  a  central  role  in  the  synthesis  of  proteins  from  an 
mRNA template. Eukaryotic ribosomes have a diameter of 25-30 nm and are made
up of two subunits. The small subunit contains one RNA molecule and the large
subunit contains three, and the ribosome as a whole contains about 80 protein
molecules. Ribosome structure and function are discussed in detail in another
volume in this series [14].

To  understand  how  biomacromolecules  function  as  molecular  machines,  their 
structures,  interactions  with  other  molecules,  and  physical  properties  must  be 
explored  and  defined  experimentally,  using  the  experimental  approaches  described 
above  and  others  that  lie  outside  the  scope  of  this  volume.  This  information 
enables  the  development  of  models  that  predict  how  the  molecule  functions.  The 
fidelity  with  which  the  model  predicts  the  behavior  of  the  molecule  can  then  be 
evaluated  in  additional  experiments.  When  sufficient  information  is  available,  a 
quantitative  model  with  defined  physical  and  chemical  parameters  can  be  con- 
structed and  used  to  simulate  mathematically  the  properties  of  the  actual  biomac- 
romolecule.  A  successful  mathematical  model  must  not  only  account  for  the 
original  experimental  data  but  must  also  provide  predictions  that  can  be  validated. 
The  predictive  quality  is  a  measure  of  the  fundamental  understanding  embodied 
in  the  model.  The  physical  chemical  concepts  that  enable  quantitative  models  to 
be  developed  are  reviewed  in  Chap.  2.  Repeated  cycles  of  experiment,  modeling, 
and  testing  the  model  against  additional  experiments,  often  by  multiple  teams  of 
investigators,  lead  to  an  increasingly  detailed  and  accurate  understanding  of  how 
the  molecular  machine  functions  in  the  cell. 


1.6    Examples  of  Molecular  Machines 

To illustrate these principles, Chaps. 9-11 focus on three families of biomolecular
machines and demonstrate how results obtained with several different research
approaches have led to our current understanding of how these biomolecular
machines work and their roles in disease. Chapter 9 focuses on enzymes called
helicases that partially unwind the double helix of the DNA molecule, enabling it to
replicate. Figure 1.8 shows an example. Membrane bound transport ATPases,
enzymes that use the chemical energy made available by hydrolyzing ATP (adenosine
triphosphate) to transport molecules across cell membranes, are the subject of
Chap. 10. Figure 1.9 shows the structure of one membrane ATPase. Chapter 11
discusses the myosin family of motor proteins that are involved in both muscle
contraction and cellular motility and transport. The structure of part of the
actin-myosin complex is shown in Fig. 1.7.

Fig. 1.9 Structure of bovine mitochondrial F1 ATPase. The native protein is a hexamer with six
subunits, three of which are shown in gray and three in yellow. The blue helices correspond to part
of the γ-subunit, which connects the membrane embedded rotary motor to the catalytic core (PDB
accession number 1E1R)


1.7    Future Prospects

While  biophysics  has  had  a  brilliant  past,  the  future  appears  even  brighter. 
Investments  in  instrument  development  have  streamlined  experiments,  making 
them  much  less  arduous,  and  the  development  of  molecular  biology  and  recombi- 
nant DNA  technology  has  greatly  increased  the  ease  with  which  molecules  can  be 
identified,  isolated,  and  manipulated.  Genomics  has  opened  a  path  for  studying  not 
only  the  major  macromolecules  of  the  cell  but  also  molecules  never  before  identi- 
fied. Proteomics  provides  new  approaches  for  studying  the  macromolecular  interac- 
tions that  are  so  critical  to  cell  function,  and  the  development  of  cell  biology  enables 
the  linkages  between  molecular  and  cellular  structure  and  function  to  be  explored  in 
detail.  Ever  increasing  computational  power  supports  all  of  these  endeavors  and 
enables  the  development  of  massive  databases  that  can  be  mined  for  new  correla- 
tions and  insights. 

As  our  understanding  has  grown,  the  relevance  and  importance  of  biophysics  to 
solving  real  world  problems  in  medicine,  industry,  agriculture,  forensics,  and  many 
other  fields  have  become  increasingly  clear.  In  addition  to  pushing  back  frontiers 
within  the  discipline,  there  is  now  a  pressing  need  and  responsibility  to  communi- 
cate the  contributions  that  biophysics  can  make  not  only  to  other  areas  of  science 
but  also  to  other  fields.  Fortunately,  the  clarity  of  the  fundamental  principles  that 
have  emerged  as  a  result  of  decades  of  dedicated  work  by  highly  trained  physical 
scientists  and  mathematicians  makes  them  highly  accessible  to  professionals  in 
other  fields  and  to  the  general  public. 

Future  Prospects,  the  final  chapter  of  this  volume  (Chap.  12),  focuses  on  the 
future  of  molecular  biophysics  and  three  of  the  major  technologies  discussed  in  this 
volume — X-ray  diffraction  and  neutron  scattering,  mass  spectrometry,  and  protein 
therapeutics.  As  always,  new  and  advancing  technologies  and  the  development  of 
new  interdisciplinary  foci  are  among  the  major  drivers.  The  chapter  concludes  with 
a discussion of the opportunities and challenges of translating biophysical discoveries
into solutions to real-world problems, as illustrated in drug and pharmaceutical
development. 


1.8    Other  Volumes  in  this  Series 

This  book  is  a  foundational  volume  for  the  series  Biophysics  for  the  Life  Sciences, 
published  by  Springer,  a  series  intended  for  students  and  professionals  from  other 
fields.  Other  volumes  in  this  series  deal  with  specific  areas  of  molecular  or  cellular 
biophysics.  Volumes  in  print  focus  on  translational  control  of  gene  expression, 
single  molecule  studies  of  proteins,  therapeutic  protein  development,  and  RNA 
folding  [13-16]. 


Chapter 2

Structural,  Physical,  and  Chemical  Principles 

Norma  M.  Allewell,  Linda  O.  Narhi,  and  Ivan  Rayment 


Abstract  This  chapter  introduces  the  structural,  physical,  and  chemical  foundations 
of  molecular  biophysics.  The  section  on  structure  describes  the  major  molecular 
components  of  the  cell  (water,  metabolites,  and  macromolecules)  and  discusses  the 
three-dimensional  macromolecular  structure,  folding,  and  assembly  of  macromol- 
ecules. The  section  on  molecular  thermodynamics  and  kinetics  includes  energies, 
equilibria,  and  rate  constants.  These  concepts  are  applied  to  protein  folding  and 
aggregation  and  illustrated  with  examples  relevant  to  the  development  of  protein 
pharmaceutics.  The  chapter  closes  with  a  discussion  of  the  interplay  between 
molecular  structure  and  energetics  and  biological  function. 

Keywords  Biomolecular  structure  •  Protein  folding  •  Macromolecular  assembly  • 
Thermodynamics  •  Energetics  •  Kinetics  •  Free  energy  •  Equilibrium  constant  •  Rate 
constant  •  Protein  aggregation  •  Protein  pharmaceutics  •  Structure-function  relationships 


Investigating  and  understanding  biophysical  properties  and  structural  problems  in 
the  life  sciences  requires  integrating  structural,  physical,  and  chemical  information. 
In  acquiring  and  interpreting  this  information,  biophysicists  rely  heavily  on  the 
theoretical  frameworks  and  experimental  approaches  of  physics  and  chemistry. 
Fully exploring the breadth and depth of these disciplines is far beyond the scope of
this  volume.  The  goal  of  this  chapter  is  simply  to  provide  nonspecialists  with  the 
core  vocabulary  and  background  in  principles  and  concepts  that  will  be  needed  to 
utilize  subsequent  chapters  in  this  volume  and  other  volumes  in  this  series.  Some 
widely  used  textbooks  are  listed  in  the  references  [1-6]. 


2.1    The  Molecules  of  the  Cell 

The major chemical components of the cell fall into a few major categories: water,
ions, metabolites, lipids, and macromolecules. Most molecules in the cell contain
at most six different elements: carbon, hydrogen, nitrogen, oxygen, phosphorus,
and sulfur. Water is the most abundant molecule in the cell and exerts
profound  effects  on  the  physicochemical  properties  and  biological  function  of  all 
other molecules. The most abundant ions in a typical mammalian cell, listed in
order of their intracellular concentrations, are potassium, bicarbonate, sodium, and
chloride,  followed  by  magnesium  and  calcium,  which,  although  present  in  much 
lower  concentrations,  have  critically  important  biological  functions.  Metabolites 
are  small  molecules,  produced  by  partially  digesting  foods.  Depending  upon  the 
needs  of  the  cell,  these  are  either  broken  down  completely  to  produce  carbon  diox- 
ide, water,  and  ammonia,  or  used  to  synthesize  other  small  molecules  or  macro- 
molecules  that  are  required  by  the  organism.  One  class  of  metabolites  that  is 
particularly  important  is  lipids,  oily  molecules  of  intermediate  size,  generally  in 
the  range  of  a  few  hundred  Daltons,  that  have  several  roles  in  the  cell.  Two  of  the 
most  important  roles  are  serving  as  major  components  of  all  cell  membranes  and 
as  major  reserves  of  chemical  energy. 

The macromolecules of the cell, with molecular weights in the range of several
thousand to millions of Daltons, are formed by linking together small molecules,
often  called  building  blocks,  to  form  polymers.  Proteins  are  formed  by  linking 
twenty  different  amino  acids  together  in  a  sequence  specific  to  a  given  protein  to 
form  a  linear  polymer  (Fig.  2.1).  The  corresponding  building  blocks  for  the  nucleic 
acids,  DNA  (deoxyribonucleic  acid)  and  RNA  (ribonucleic  acid),  are  four  nucleo- 
tides, each  of  which  has  three  elements — an  aromatic  base,  characteristic  of  a  given 
nucleotide;  a  sugar;  and  a  phosphate  group  (Fig.  2.2).  Three  of  the  four  bases  in 
RNA  and  DNA  are  identical,  while  the  fourth  differs  by  one  functional  group.  The 
sugar  groups  in  DNA  and  RNA  also  differ  by  one  functional  group.  Although  these 
chemical  differences  between  DNA  and  RNA  may  appear  minor,  they  have  major 
biological  consequences.  Polysaccharides  are  formed  by  linking  together  many 
sugar  groups  from  a  defined  set  of  molecules,  often  in  a  branched  structure  (Fig.  2.3). 
Glycogen,  a  polymer  of  glucose,  is,  together  with  the  lipids  mentioned  above,  a 
major  reservoir  of  chemical  energy  in  the  cell. 

Different  kinds  of  molecules  also  combine  to  increase  the  chemical  and  biologi- 
cal repertoire  of  the  cell.  For  example,  glycolipids  have  both  lipid  and  sugar 
components,  while  glycoproteins  have  both  protein  and  sugar  components  (Fig.  2.4). 

Fig. 2.1 Proteins are linear
condensation  polymers  of 
amino  acids.  The  amino  acids 
are  connected  by  amide 
linkages  or  "peptide  bonds." 
The  structure  and  function  of 
the  protein  is  governed  by  the 
sequence  of  amino  acids  in 
the  polypeptide  chain 

Fig. 2.2 Building blocks of DNA and RNA. DNA and RNA both utilize four bases to convey
genetic information. Three of these are common to both; the fourth is thymine in
DNA and uracil in RNA. In addition, the sugar in DNA is deoxyribose, while the sugar in RNA
is ribose


2.2    The  Importance  of  Three-Dimensional  Structure 

While  the  chemical  properties  of  small  molecules  can  be  inferred  from  their  covalent 
chemical  structure  alone,  the  physical,  chemical,  and  biological  properties  of  biological 
macromolecules  depend  not  only  on  their  covalent  chemical  structure  but  also  on  the 
three-dimensional  shape  (conformation)  that  they  assume  within  the  cell.  The  double 
helical  structure  of  DNA,  first  proposed  by  Watson  and  Crick,  is  an  outstanding  example 
(Fig. 2.5). As many readers will be aware, the double helix consists of two DNA
strands, held together by noncovalent interactions. This structure is integral to the
transmission of genetic information within cells and to the next generation.

Fig. 2.3 Carbohydrates are the most abundant macromolecules in the biosphere. They form the
basis for polysaccharides, which are polymers of sugar moieties. An immense variety of
building blocks is available, leading to linear or branched chains


Protein glycosylation: combinatorial
attachment of carbohydrates

Fig. 2.4 Complex biological molecules are constructed from small building blocks. Carbohydrates,
lipids, and proteins are frequently combined to create larger macromolecules. This combinatorial
approach dramatically increases the biological functions that can be accomplished with a small
number of chemical templates. Glycolipids are key components of membranes, whereas
glycosylation of proteins creates a population of proteins with differing characteristics


B-DNA

Fig. 2.5 Three-dimensional structure of DNA. B-DNA forms a double helix whose conformation
is dictated in part by hydrogen bonds between complementary pairs of bases (RCSB accession
number 1D29)

Most  proteins  have  complex  three-dimensional  structures  that  form  as  a  result  of 
the  folding  of  the  chain  of  linked  amino  acids  in  the  protein.  The  way  in  which  the 
protein  folds  is  determined  by  noncovalent  interactions  between  the  chemical 
groups  in  the  protein  including  the  peptide  backbone  and  the  side  chains  of  the 
amino  acids,  as  well  as  by  the  interaction  of  the  protein  with  molecules  and  ions  in 
the  environment.  The  folded  protein  has  a  unique  structure  resulting  in  specific  sur- 
face properties  that  enable  the  protein  to  recognize  and  interact  selectively  with 
specific  molecules  within  the  cell  (Fig.  2.6).  Examples  include  the  interactions  of 
enzymes  with  substrates,  actin  with  myosin,  and  antibodies  with  antigens. 

The  three-dimensional  structures  of  RNA  share  similarities  with  both  DNA  and 
proteins.  In  contrast  to  DNA,  most  RNA  molecules  are  single  stranded.  However, 
turns  or  hairpin  bends  in  the  polynucleotide  strand  enable  short  double  helical  seg- 
ments to  form,  similar  to  those  found  in  DNA  (Fig.  2.5).  In  addition,  other  bends 
result  in  the  overall  structure  of  the  molecule  being  fairly  compact,  as  most  protein 
molecules  are.  The  three-dimensional  structures  of  RNA  molecules  sometimes 
enable  them  to  bind  cognate  molecules  selectively  and  specifically  and  to  catalyze 
chemical  reactions,  just  as  proteins  do. 

The  three-dimensional  structures  of  the  other  macromolecules  of  the  cell  tend  to 
be  less  well  defined  than  those  of  the  proteins  and  nucleic  acids,  because  molecular 
recognition  is  not  one  of  their  major  functions.  For  example,  glycogen  exists  in 
disordered  granules  in  cells  and  fats  exist  as  lipid  droplets.  On  the  other  hand,  their 
characteristic  chemical  features  enable  them  to  be  recognized  by  the  proteins  with 
which  they  interact. 

Fig. 2.6 Proteins adopt structures that are complementary to their ligands. To a first
approximation, most proteins adopt a unique fold that is essential for function. Furthermore, all proteins have
a function and hence interact with some other component of the biological milieu. In this case the
binding site and the ligand exhibit complementarity, as seen here for the binding of
(N-acetylglucosamine)3 to hen egg white lysozyme (RCSB accession number 1LZB). The cartoon
ribbon representation (left) reveals the path of the polypeptide chain and major secondary
structural features that define the fold. The electrostatic surface (right) shows the magnitude and nature
of the active site cleft. Positive and negative potential are depicted in blue and red, respectively

The  structures  of  macromolecules  are  not  rigid,  and  cannot  be  defined  by  a  single 
set  of  coordinates,  because  the  forces  that  shape  the  three-dimensional  structures  are 
weak.  The  flexibility  of  biological  molecules  is  essential  to  their  biological  function 
for  two  reasons.  First,  it  enables  the  molecules  to  function  as  macromolecular 
machines,  and,  secondly,  it  enables  the  molecules  and  hence  the  cell  and  the  organism 
to  respond  to  their  environment.  The  degree  of  flexibility  spans  a  wide  range,  depend- 
ing on  the  number  and  configuration  of  the  intra-  and  intermolecular  interactions  that 
maintain  the  structure,  as  well  as  external  conditions.  In  conditions  similar  to  those 
that  exist  in  the  cell,  most  DNA  and  protein  molecules  are  quite  stable,  completely 
unwinding  or  unfolding  in  solution  only  when  the  temperature  is  raised  or  chemicals 
that  disrupt  the  native  structure  are  introduced.  At  the  same  time,  because  the  intra- 
molecular forces  are  weak,  macromolecular  structures  are  in  constant  motion. 
Generally  these  rapid  structural  fluctuations  are  integrally  related  to  function.  The 
frequency  and  scale  of  the  fluctuations  span  a  wide  range,  with  the  structure  of  RNA 
generally  being  more  fluid  than  that  of  double-stranded  DNA  and  proteins. 


2.3    Macromolecular  Folding  and  Assembly 

Biological  macromolecules  have  a  remarkable  ability  to  self-assemble.  The  same 
forces  that  drive  individual  macromolecules  to  assume  their  three-dimensional 
structures  also  allow  macromolecules  to  assemble  into  larger  structures.  For  example, 
proteins interact with DNA to form chromatin and with RNA to form ribosomes.
Lipids  assemble  into  bilayers  studded  with  membrane  proteins  in  the  cell  membrane 
and  in  the  intracellular  membranes  that  form  the  boundaries  of  cellular  organelles 
such  as  mitochondria.  Proteins  assemble  with  other  proteins  to  form  multisubunit 
enzymes, the cytoskeleton of the cell, muscle fibers, virus capsids, and many other
specialized  structures. 

To understand the forces that drive both the folding and assembly of macromolecules,
we need to delve a little deeper into the chemical structures of their macromolecular
components. The kinds of noncovalent interactions that can occur are
limited in number because most biological molecules contain only six elements:
carbon, hydrogen, nitrogen, oxygen, sulfur, and phosphorus. Hydrogen bonds
form  when  two  electronegative  atoms  (N  or  O)  interact  with  each  other  by  sharing 
a  single  hydrogen  atom.  As  an  example,  hydrogen  bonds  between  complementary 
bases  link  the  two  strands  of  the  DNA  double  helix.  Hydrophobic  interactions  form 
when  nonpolar  groups,  made  up  solely  of  carbon  and  hydrogen  atoms,  cluster  to 
shield  themselves  from  water,  in  the  same  way  that  oil  forms  droplets  in  water. 
Ionic  bonds  involve  interactions  between  oppositely  charged  groups — for  example, 
negatively  charged  phosphate  groups  and  positively  charged  amino  groups.  Stacking 
interactions  occur  between  aromatic  rings — for  example,  between  the  bases  of 
nucleic  acids. 

When  individual  macromolecules  fold  or  macromolecules  assemble  to  form  a 
larger  structure,  they  do  so  in  a  way  that  maximizes  these  interactions.  For  example, 
the  double  helical  structure  of  DNA  maximizes  the  number  of  hydrogen  bonds  and 
stacking  interactions  that  form,  as  shown  in  Fig.  2.5.  Similarly,  the  helices  and 
pleated  sheets  of  proteins  are  rich  in  hydrogen  bonds,  while  hydrophobic  groups 
tend  to  cluster  in  the  interior  of  proteins  where  they  are  shielded  from  water.  When 
charged  groups  occur  in  the  interior  of  proteins,  groups  with  opposite  charges  are 
almost  always  paired  in  ionic  bonds  (salt  bridges). 

The surfaces that come together when macromolecules assemble into larger
structures tend to have complementary charged or nonpolar regions.
Thus  a  positively  charged  region  on  one  surface  will  be  matched  by  a  negatively 
charged  region  on  the  opposing  surface,  while  nonpolar,  hydrophobic  regions  will 
be  positioned  so  that  they  come  together  when  the  two  surfaces  meet. 

The  weak  forces  that  hold  macromolecular  assemblies  together  allow  some  to 
act  as  molecular  switches  that  change  the  way  they  assemble  when  the  external 
environment  changes.  Since  structure  is  so  tightly  coupled  to  function,  any  change 
in  structure  will  most  likely  also  result  in  a  change  in  function  that  enables  the 
organism to respond to change in its external environment. One of the first molecular
switches to be discovered was hemoglobin, the protein that binds oxygen in the
lung  and  releases  it  in  the  tissues.  In  the  high  oxygen  environment  of  the  lungs,  the 
four  subunits  of  hemoglobin  assemble  in  one  mode  that  binds  oxygen  strongly.  In 
the  low  oxygen  environment  of  the  tissues,  the  subunits  assemble  in  a  different 
mode  that  binds  oxygen  weakly,  allowing  it  to  be  released  into  the  tissues.  Molecular 
switches  are  now  recognized  to  be  ubiquitous  and  to  play  major  roles  in  both  health 
and disease.


2.4    Thermodynamic and Kinetic Principles

Knowing  the  structure  of  a  macromolecule  is  only  the  first  step  in  understanding  the 
physical  basis  of  its  function.  Advancing  to  the  next  level  requires  defining  the  cellular 
processes  in  which  the  macromolecule  participates  in  chemical  terms.  This  includes 
not only identifying all of the reactants, intermediates, and products but also defining
the thermodynamic and kinetic features of each step of the process, that is, the
differences  in  energy  between  reactants  and  products  (thermodynamics)  and  the  rates 
of  both  the  forward  and  reverse  reactions  (kinetics).  When  these  quantities  are 
defined,  the  amounts  of  reactant  and  product  that  will  be  present  under  different 
conditions  and  the  overall  rate  at  which  reactant  will  be  converted  to  product  under 
different  conditions  can  be  predicted. 


2.4.1    Energies,  Equilibrium  Constants,  and  Rates 

Since  molecular  biophysics  emphasizes  reactions  carried  out  in  solution,  generally 
at constant temperature and pressure, the three thermodynamic parameters of greatest
interest are often the Gibbs free energy (named after the great nineteenth century
mathematician  and  physical  chemist  Willard  Gibbs),  enthalpy,  and  entropy. 

When  a  chemical  reaction  occurs,  the  heat  released  or  absorbed  corresponds  to 
the  difference  in  the  enthalpy  of  products  and  reactants.  The  difference  in  the  degree 
of  disorder  of  products  and  reactants  corresponds  to  the  change  in  the  entropy  of  the 
reaction. The change in Gibbs free energy (ΔG) in the reaction is related to the
changes in enthalpy (ΔH) and entropy (ΔS) as described by the following
equation:

$$\Delta G = \Delta H - T\Delta S \tag{2.1}$$

The sign of the change in the free energy of a reaction indicates whether a reaction
is spontaneous or occurs only with the input of external energy. A negative
change  in  free  energy  corresponds  to  a  spontaneous  reaction  that  does  not  need  to 
be  driven  by  the  input  of  energy.  Conversely  a  reaction  with  a  positive  change  in  free 
energy  will  not  occur  without  the  input  of  energy. 

The  sign  of  the  enthalpy  change  indicates  whether  heat  is  released  or  absorbed  in 
a  reaction.  A  negative  change  in  enthalpy  indicates  that  the  reaction  results  in  the 
release  of  heat.  As  the  equation  indicates,  a  negative  change  in  enthalpy  contributes 
to  a  more  negative  free  energy,  and  thus  makes  the  reaction  more  favorable. 

A  negative  change  in  entropy  indicates  that  the  products  are  more  ordered  than 
the  reactants,  and  also  results  in  a  more  positive  change  in  free  energy.  Thus  any 
increases  in  order  make  a  reaction  less  favorable. 
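
The sign conventions described above can be checked numerically. The following Python sketch evaluates Eq. 2.1 for a few cases; the function names and all numerical values are illustrative assumptions, not data from the text.

```python
def gibbs_free_energy(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS (Eq. 2.1).

    delta_h in kcal/mol, delta_s in kcal/(K mol), temperature in K.
    """
    return delta_h - temperature * delta_s


def is_spontaneous(delta_g):
    """A reaction is spontaneous when its free energy change is negative."""
    return delta_g < 0


# Hypothetical values chosen only to illustrate the sign conventions.
T = 298.0
cases = {
    "heat released, order increases": (-10.0, -0.020),
    "heat absorbed, order decreases": (5.0, 0.030),
    "heat absorbed, order increases": (5.0, -0.020),
}

for label, (dh, ds) in cases.items():
    dg = gibbs_free_energy(dh, ds, T)
    print(f"{label}: dG = {dg:+.2f} kcal/mol, spontaneous = {is_spontaneous(dg)}")
```

In the first case the favorable enthalpy outweighs the unfavorable entropy term; in the second, the entropy term drives the reaction despite the uptake of heat; in the third, both terms oppose the reaction.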

Many chemical reactions are readily reversible, with reactants and products coexisting.
The ratio of the concentration of product to reactant at equilibrium defines the equilibrium
constant (K). In the simplest case, in which A is the reactant and B is the product,
the equilibrium constant is defined as the ratio of the molar concentrations [B]
and [A]:

$$K = \frac{[B]}{[A]} \tag{2.2}$$

Equilibrium constants and standard free energies are related by the equation:

$$\Delta G^{\circ} = -RT \ln K \tag{2.3}$$

where ΔG° is the change in free energy at standard concentrations of reactants
and products and standard temperature and pressure. R is a universal constant
called the gas constant because it is defined by the equation for ideal gases (PV = nRT,
where n is the number of moles, T is the ideal gas temperature in kelvins
(K), V is the volume of the system, and P is the pressure). R has a value of 8.31 J K⁻¹ mol⁻¹.
For biochemical reactions in solution, the standard concentration is typically 1 M,
the standard temperature is 298.15 K, the standard pressure is 1 atm, and R has a
value of 1.98 × 10⁻³ kcal K⁻¹ mol⁻¹.

The free energy change at other concentrations of reactants and products is
given by

$$\Delta G = \Delta G^{\circ} + RT \ln \frac{[B]}{[A]} \tag{2.4}$$
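
As a rough illustration of how the standard free energy and the concentration ratio combine, the Python sketch below evaluates the free energy change at several [B]/[A] ratios. The equilibrium constant of 50 and the function names are hypothetical choices made for this example.

```python
import math

R = 1.98e-3  # gas constant in kcal K^-1 mol^-1, the value quoted in the text


def standard_free_energy(K, T=298.15):
    """dG0 = -RT ln K (Eq. 2.3)."""
    return -R * T * math.log(K)


def free_energy(delta_g0, ratio, T=298.15):
    """dG = dG0 + RT ln([B]/[A]) (Eq. 2.4)."""
    return delta_g0 + R * T * math.log(ratio)


K = 50.0  # hypothetical equilibrium constant
dG0 = standard_free_energy(K)

# Below, at, and above the equilibrium ratio.
for ratio in (1.0, K, 500.0):
    print(f"[B]/[A] = {ratio:6.1f}   dG = {free_energy(dG0, ratio):+.2f} kcal/mol")
```

When [B]/[A] equals K the free energy change is zero, which is the thermodynamic statement that the system is at equilibrium. Below that ratio the forward reaction is favorable, and above it the reverse reaction is.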

While thermodynamics defines the energies of the components of a biophysical
system, rate constants define the rates of the reactions that link them. For example,
two rate constants define the rates of the interconversion of A and B: k₁, the rate
constant for converting A to B, and k₋₁, the rate constant for converting B to A.

$$A \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} B$$

In the absence of B, the rate at which A is converted to B is given by k₁[A], and,
in the absence of A, the rate at which B is converted to A is given by k₋₁[B].

When A and B coexist, the net rate of change of [A] will be the difference in the
rates of the forward and reverse reactions:

$$\frac{d[A]}{dt} = k_{-1}[B] - k_1[A]$$

At equilibrium, d[A]/dt = 0, so k₋₁[B] = k₁[A]. Since K, the equilibrium constant
for the reaction, equals [B]/[A], it follows that

$$K = \frac{k_1}{k_{-1}} \tag{2.5}$$
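
The link between rate constants and the equilibrium constant can be seen by following the reaction in time. The sketch below integrates the rate equation with simple Euler steps. The rate constants, starting concentrations, and step size are hypothetical values picked for illustration.

```python
def simulate_two_state(k1, k_minus1, a0=1.0, b0=0.0, dt=1e-3, steps=20000):
    """Integrate d[A]/dt = k_-1[B] - k_1[A] with fixed Euler steps.

    Returns the final concentrations ([A], [B]).
    """
    a, b = a0, b0
    for _ in range(steps):
        change_in_a = (k_minus1 * b - k1 * a) * dt  # net change in [A] over dt
        a += change_in_a
        b -= change_in_a  # whatever A loses, B gains
    return a, b


k1, k_minus1 = 2.0, 0.5  # hypothetical rate constants, s^-1
a_eq, b_eq = simulate_two_state(k1, k_minus1)

print("[B]/[A] at long times:", b_eq / a_eq)
print("k1 / k_-1:            ", k1 / k_minus1)
```

Starting from pure A, the simulated ratio [B]/[A] settles at the value k₁/k₋₁, as Eq. 2.5 requires, regardless of the starting concentrations.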


2.5    An Example: The Energetics of Protein Folding

To illustrate how thermodynamics and kinetics are helpful in understanding the
behavior of macromolecules, let's consider the case of protein folding. The simplest
reaction pathway can be represented as

$$U \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} F$$

where U is the unfolded protein, F is the folded protein, k₁ is the rate constant for
folding, and k₋₁ is the rate constant for unfolding. The rate constants for this reaction
have units of s⁻¹.

The  equilibrium  constant  for  the  folding  reaction  will  be  given  by 

$$K = \frac{[F]}{[U]}$$

where  the  fraction  indicates  the  ratio  of  the  concentrations  of  folded  and  unfolded 
protein.  Often  this  ratio  can  be  determined  experimentally. 

Let's  suppose  that  this  ratio  is  1,000,  meaning  that  there  are  1,000  times  as  many 
folded  protein  molecules  in  solution  under  the  conditions  of  the  experiment  as  there 
are  unfolded  molecules.  Then  the  value  of  the  equilibrium  constant  under  those 
conditions  will  also  be  1,000. 

With  the  equilibrium  constant  known,  the  value  of  the  change  in  free  energy  can 
also  be  calculated,  using  the  following  values: 

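$$R = 1.98 \times 10^{-3}\ \text{kcal K}^{-1}\,\text{mol}^{-1}$$

$$T = 298\ \text{K}$$

$$K = 1{,}000; \quad \log K = 3; \quad \ln K = 2.303\,(\log K)$$

$$\Delta G^{\circ} = -RT \ln K = -(1.98 \times 10^{-3})(298)(2.303)(3) = -4.08\ \text{kcal mol}^{-1}$$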

As  expected,  the  change  in  the  standard  free  energy  is  negative,  since  protein 
folding  is  a  spontaneous  reaction. 
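
The same arithmetic can be reproduced in a few lines of Python. This sketch also reports the fraction of molecules that are folded, K/(1 + K), which follows directly from the definition of K but is not computed in the text. The variable names are arbitrary.

```python
import math

R = 1.98e-3   # gas constant, kcal K^-1 mol^-1
T = 298.0     # temperature, K
K = 1000.0    # [F]/[U], the ratio assumed in the example

delta_g0 = -R * T * math.log(K)    # Eq. 2.3
fraction_folded = K / (1.0 + K)    # [F] / ([F] + [U])

print(f"dG0 = {delta_g0:.2f} kcal/mol")
print(f"fraction folded = {fraction_folded:.4f}")
```

A free energy change of only about 4 kcal per mole is enough to keep roughly 999 of every 1,000 molecules folded, which illustrates how modest the net stability of a folded protein can be.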

With  the  standard  free  energy  change  known,  the  next  question  is,  what  are  the 
values  for  the  changes  in  enthalpy  and  entropy?  Does  folding  a  protein  consume  or 
release  heat?  Does  the  system  (which  includes  both  the  protein  and  the  solvent  in 
which  it  is  dissolved)  become  more  ordered  or  disordered  when  the  protein  folds? 

Enthalpy changes for temperature-induced chemical changes can often be determined
experimentally using instruments called scanning calorimeters, in which a
solution is gradually heated, and the heat lost or gained by thermally induced processes
is measured. Calorimeters can readily be used to study protein unfolding,
since most proteins unfold when they are heated. Let's suppose that the molar
enthalpy change for unfolding the protein is 10 kcal mol⁻¹, indicating that 10 kcal of
heat are absorbed when one mole of protein unfolds. Since folding is the reverse of
unfolding, 10 kcal of heat will be released when one mole of protein folds, and the
enthalpy change for folding is -10 kcal mol⁻¹.

Next, let's insert the values for the standard free energy and enthalpy into the
Gibbs equation:

$$\Delta G^{\circ} = \Delta H^{\circ} - T\Delta S^{\circ}$$

$$\Delta S^{\circ} = -\frac{1}{T}\left(\Delta G^{\circ} - \Delta H^{\circ}\right) = -\frac{1}{298}\left(-4.08 + 10\right)\ \text{kcal K}^{-1}\,\text{mol}^{-1} = -19.9\ \text{cal K}^{-1}\,\text{mol}^{-1}$$

The  negative  entropy  change  indicates  that,  in  this  case,  protein  folding  results  in 
a  decrease  in  the  entropy  of  the  system  (protein  plus  solution),  or  an  increase  in 
order.  The  most  likely  contributors  to  the  increase  in  order  are  the  folding  of  the 
protein  and  the  ordering  of  solvent  and  ions  around  the  folded  protein. 
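
To make the balance between the enthalpy and entropy terms explicit, the short Python sketch below repeats the calculation using the illustrative values from this example and prints the -TΔS° contribution alongside ΔS°. Variable names are arbitrary.

```python
T = 298.0          # temperature, K
delta_g0 = -4.08   # standard free energy of folding, kcal/mol
delta_h0 = -10.0   # assumed enthalpy of folding from the example, kcal/mol

delta_s0 = (delta_h0 - delta_g0) / T   # rearranged Gibbs equation, kcal K^-1 mol^-1
entropic_term = -T * delta_s0          # the -T*dS0 contribution, kcal/mol

print(f"dS0    = {delta_s0 * 1000:.1f} cal K^-1 mol^-1")
print(f"-T*dS0 = {entropic_term:+.2f} kcal/mol")
print(f"dH0 + (-T*dS0) = {delta_h0 + entropic_term:.2f} kcal/mol")
```

In this example the favorable enthalpy of -10 kcal mol⁻¹ is opposed by an entropic penalty of about +5.9 kcal mol⁻¹, leaving the net free energy of about -4.1 kcal mol⁻¹.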


2.6    Applications  to  Protein  Pharmaceutics 

The above reaction and equations represent the simplest case of protein folding, and
the most common situation in vivo. Kinetic and thermodynamic theory encompasses
many other scenarios and many other questions. As the complexity of the
reaction under study increases, the equations become more complex. However, the
principles remain the same. More complex situations involving not only equilibrium
unfolding but also the formation of intermediate states and irreversible interactions
also occur, both in vivo and in vitro. A common unfolding intermediate that is seen
for many proteins under mildly denaturing conditions is the "molten globule" state,
first defined by Wada and Ohgushi in 1983 [7]. A molten globule is defined as an
intermediate state that has lost its native tertiary structure and has increased surface
hydrophobicity but retains native secondary structure. Depending on the conditions
involved, this intermediate can form reversibly from the native structure, or
it can associate irreversibly, resulting in larger aggregated species.

Another example of complex protein folding, unfolding, and self-association
reactions occurring in vivo is the formation of amyloid fibers. This irreversible
structure is characterized by formation of intermolecular beta sheet secondary structure
(regardless of the native secondary structure of the monomer), and is defined by
the ~4.7 Å and 10 Å signatures of the cross-β diffraction pattern. Amyloid formation
is initiated by a nucleation event believed to start with an unfolded intermediate
species that results in a lag phase prior to formation of the protein aggregate [8].
This phenomenon was originally thought to occur for only a few specific proteins,
but it has subsequently been demonstrated to occur for most proteins under some
conditions. Amyloid formation has been implicated in several diseases, including
Alzheimer's disease.

The unfolding of multi-domain proteins, such as the monoclonal antibodies commonly
used as protein therapeutics, is even more complex, with multiple intermediates
and irreversible aggregated species. The formation of intermediates is usually
reversible, followed by irreversible self-association. Unfolding of a multi-domain
protein can be cooperative with all domains unfolding together, in which case it can
be modeled as a single domain. Alternatively, each domain can unfold independently,
resulting in multiple intermediates with different combinations of folded and
unfolded domains. The multistep unfolding and aggregation of proteins is a process
that can be described by the Lumry-Eyring framework [9, 10]. Each state of the
protein has a different free energy of activation, and also a different colloidal stability
(propensity for self-association under the conditions of the surrounding solution
or environment). The rate limiting step is the one with the highest energy barrier.

Fig. 2.7 Schematic of potential protein unfolding reactions
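
The idea of a reversible unfolding step followed by an irreversible association step can be sketched with a three-state toy model, N ⇌ U → A. This is a deliberately minimal sketch and not the full treatment of [9, 10]: aggregation is assumed to be first order in U for simplicity, and the function name, rate constants, and step size are all hypothetical.

```python
def lumry_eyring(k_unfold, k_fold, k_agg, dt=0.01, steps=100000):
    """Euler integration of the toy scheme N <-> U -> A.

    N: native, U: unfolded, A: aggregate, all as fractions of total protein.
    Aggregation is treated as first order in U (a simplifying assumption).
    Returns the final fractions (N, U, A).
    """
    n, u, a = 1.0, 0.0, 0.0
    for _ in range(steps):
        unfolded = k_unfold * n * dt   # N -> U during this step
        refolded = k_fold * u * dt     # U -> N during this step
        aggregated = k_agg * u * dt    # U -> A during this step (irreversible)
        n += refolded - unfolded
        u += unfolded - refolded - aggregated
        a += aggregated
    return n, u, a


# Hypothetical rate constants (s^-1): fast reversible unfolding, slower aggregation.
n, u, a = lumry_eyring(k_unfold=0.01, k_fold=1.0, k_agg=0.05)
print(f"N = {n:.3f}, U = {u:.3f}, A = {a:.3f}")
```

Even though only a small fraction of the protein is unfolded at any instant, the irreversible step steadily drains the native population. This is why a protein that is thermodynamically stable can still aggregate over time.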

As proteins are isolated and studied, the range of solution conditions to which they are
exposed also increases, and many of these conditions can have deleterious effects on the
native state of the protein. The complexity of protein folding, unfolding, and self-association
is indicated in Fig. 2.7, a result of the work described in [9, 10] and
others.

Subsequent  chapters  will  further  develop  these  principles  and  provide  examples 
of  their  application  to  specific  biophysical  phenomena. 


2.7    Relating  Structure,  Energetics,  and  Function 

Over  millions  of  years,  the  structures  of  biological  macromolecules  have  evolved 
to  optimize  their  ability  to  perform  their  specific  biological  functions.  As  a  result, 
the  structure,  physical  properties,  and  function  of  biological  macromolecules  are 
intimately  related,  and  understanding  these  relationships  is  key  to  understanding 
how biological macromolecules function as molecular machines in living cells.

As we explore these relationships, the kinds of questions molecular biophysicists
ask  include  the  following: 

1.  What  is  the  relationship  between  the  chemical  structure  of  the  macromolecule 
and  its  three-dimensional  structure? 

2.  What  does  the  three-dimensional  structure  of  the  macromolecule  predict  about 
its  physical  and  chemical  properties  and  its  biological  function? 

3.  How many different three-dimensional structures (states) can the macromolecule
assume?

4.  What is the equilibrium distribution of the macromolecule between these different
states? Which, if any, represent irreversible reactions?

5.  How  rapidly  do  the  states  interconvert? 

6.  How  does  this  distribution  of  states  change  when  the  macromolecule  interacts 
with  other  molecules  or  ions  in  the  cell? 

7.  How  do  the  functional  properties  of  the  different  states  differ? 

8.  How  do  these  differences  enable  the  macromolecule  to  perform  its  biological 
function?  When  do  they  impede  this  function? 

In answering these questions, contemporary biophysics makes use of a set of
sophisticated physical and chemical experimental approaches and instrumentation
that was introduced in Chap. 1 and will be the subject of several subsequent chapters.
These studies are enriched immensely when the powerful approaches of
molecular biology are used to examine the effects of systematically modifying
chemical structure, and computational tools are used for in-depth analysis and modeling.
Chapters 9-11 will illustrate how physical, chemical, molecular biological,
and computational tools can be used in combination to develop a complete understanding
of three very different biomolecular machines in terms of their structure,
energetics, and function. The three classes of molecules to be examined are helicases,
proteins that unwind the DNA double helix to allow DNA replication; membrane
ATPases, which convert electrical energy into chemical energy; and a
molecular motor (myosin). Despite the very different biological functions of these
molecules, all three operate according to very similar physicochemical principles,
and hence the strategies used to investigate their functional mechanisms follow a
common biophysical theme.


Abstract The interaction of light with macromolecules is used in multiple different
ways in the life sciences. These interactions can be exploited to learn about the
structure, stability, and function of proteins and nucleic acids. In this chapter we
cover analysis of macromolecules, primarily at equilibrium, by UV absorbance,
circular dichroism, fluorescence, Fourier transform infrared, and Raman spectroscopies,
and light scattering. A brief description of the underlying theory and some
examples of applications are provided for each technique.

Keywords  UV  absorbance  •  Fluorescence  spectroscopy  •  Fourier  transform  infrared 
(FTIR)  spectroscopy  •  Raman  spectroscopy  •  Circular  dichroism  •  Light  scattering 
•  Spectroscopy 


3.1  Introduction 


3.1.1    Physical  Basis  of  Light  Spectroscopy 

The  explanation  of  the  nature  of  light  and  its  interactions  with  molecules  is  one  of 
the  basic  tenets  of  quantum  mechanics.  Light  is  a  rapidly  oscillating  electromagnetic 
field  that  can  be  described  as  either  a  wave  or  a  photon.  In-depth  discussion  of  the 
fundamentals  of  the  interactions  of  light  with  molecules  and  applications  of  these 
phenomena for the analysis of macromolecules are included in most physics and
chemistry textbooks, and in multiple reviews on the subject [1-6]. A brief summary
of some of the basic concepts is presented here so that the principles of the different
types of analyses discussed in this chapter can be understood. The energy of the
photon can be related to frequency or wavelength by the equation below:

$$E = h\nu = \frac{hc}{\lambda} \tag{3.1}$$

This relation between the energy and frequency is called the Planck relation,
where h is the Planck constant (6.626 × 10⁻³⁴ J s), c is the speed of light (3 × 10⁸ m/s),
ν is the frequency (s⁻¹), and λ is the wavelength (m).
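
For a sense of scale, the Planck relation can be evaluated at wavelengths relevant to this chapter. The Python sketch below uses the rounded constants quoted above; the function names and the choice of wavelengths are illustrative.

```python
H = 6.626e-34   # Planck constant, J s
C = 3.0e8       # speed of light, m/s (rounded value used in the text)


def photon_frequency(wavelength_nm):
    """Frequency in s^-1 for a wavelength given in nanometers."""
    return C / (wavelength_nm * 1e-9)


def photon_energy(wavelength_nm):
    """Energy of a single photon in joules, E = h*c/lambda."""
    return H * C / (wavelength_nm * 1e-9)


# Two ultraviolet wavelengths and one visible wavelength.
for wavelength in (260, 280, 500):
    print(f"{wavelength} nm: frequency = {photon_frequency(wavelength):.3e} s^-1, "
          f"energy = {photon_energy(wavelength):.3e} J")
```

Shorter wavelengths correspond to higher frequencies and more energetic photons, so an ultraviolet photon at 260 nm carries nearly twice the energy of a visible photon at 500 nm.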

The formation of a molecular bond results in a bonding orbital, which is the lowest
energy orbital and is occupied by spin paired electrons, and the formation of an
antibonding orbital that is at a higher energy and usually empty. If the molecule has
an odd number of electrons, one of these electrons will occupy the nonbonding
orbital. At any given time the majority of molecules in a population exist in the
ground state, with the electrons in the bonding orbital. When the energy of an incident
photon is equivalent to the energy gap between the bonding and antibonding
electron orbitals, the electron absorbs this photon and temporarily makes the transition
from the lower to the higher energy state (from the HOMO or highest occupied
molecular orbital to the LUMO, or lowest unoccupied molecular orbital), and the
molecule  as  a  whole  undergoes  a  transition  from  the  ground  state  to  the  excited 
state.  The  electron  then  returns  to  the  HOMO,  and  the  molecule  returns  to  the 
ground  state,  by  one  of  many  different  pathways  including  the  generation  of  heat, 
solvent  quenching,  and  vibrational  modes  (non-radiative  processes),  or  fluorescence 
or  phosphorescence  (radiative  processes).  These  transitions  can  be  further  described 
by  the  spin  state  of  the  electrons;  the  singlet  state  where  they  have  opposite  spins  so 
that  the  total  spin  state  of  the  molecule  is  0,  or  the  triplet  state,  where  the  electrons 
have parallel spins and the total spin state is 1. This is often visualized in a Jablonski
diagram,  shown  in  Fig.  3.1  [7],  and  is  discussed  briefly  in  Chap.  2  in  this  volume  as 
well  as  numerous  textbooks  [4,  8].  The  smaller  the  energy  gap  between  the  HOMO 
and the LUMO, the longer the wavelength of light that is absorbed. In macromolecules
(and other organic molecules) the transition from the π to the π* orbital is the
transition that contributes most to absorbance, though σ to σ* transitions can occur
as well, with the n to π* occurring much less frequently. Within each electronic state
there  are  also  vibrational  and  rotational  states  of  the  molecule.  The  distribution  of 
the  energy  of  the  electronic  excited  states  across  these  molecular  states  results  in 
broadening  of  the  peak  of  light  absorbed  from  a  line  spectrum  to  a  distribution,  quite 
often  a  Lorentzian  distribution,  as  the  electrons  transition  to  the  LUMO.  The  peak 
of  emitted  light  as  the  molecule  returns  to  the  ground  state  is  also  broadened. 

The excited states of molecules usually have a greater charge separation than the
ground state, resulting in a larger dipole moment. Interactions between the molecular
dipole and the solvent or surrounding environment can increase or decrease the
energy needed to transfer from the HOMO to LUMO states, affecting the
wavelength of light absorbed by the molecule as it makes this transition. The intensity
of the absorbance is also dependent on the number of double bonds that are connected,
or conjugated. This conjugation of the carbon bonds lowers the energy of
the π to π* transition, resulting in an increase in both the wavelength and intensity
of the absorbance peak.

Fig. 3.1 Jablonski diagram showing absorbance of energy with excitation of the electron to an
excited state, with multiple pathways for dissipating the energy as the electron returns to the
ground state, including non-radiative (internal conversion, vibrational relaxation) and radiative
(fluorescence and phosphorescence) processes (from [1])

The  transition  dipole  moment  is  the  electric  dipole  moment  vector  associated 
with  the  interaction  of  the  molecule  with  the  electromagnetic  radiation  (light)  to 
which  it  is  exposed.  The  direction  of  the  vector  determines  the  polarization  of  the 
transition,  a  principle  that  is  very  useful  in  fluorescence  spectroscopy. 

In biological macromolecules the chromophores that absorb in the
visible region are cofactors, including metal-containing molecules such as the hemes
in cytochromes and hemoglobin, or those like the flavins and carotenoids, which
have extended π conjugation. The proteins and nucleic acids themselves intrinsically
absorb in the ultraviolet and infrared regions of the spectrum. The transition
between ground and excited states for these chromophores is influenced by the environment
in which they are located, making them sensitive probes of the conformation
and stability of the molecules. In proteins, the peptide backbone, the aromatic
amino acid side chains, and the disulfide bonds are the primary chromophores
(Table 3.1). The peptide backbone absorbs between 170 and 230 nm [8, 9] as a
result of two major electron transitions, the π to π* transition at 190-195 nm and the
weaker n to π* transition occurring at 210-220 nm. The π to π* transition of the side
chains of the aromatic amino acids is responsible for the signal in the near-UV
region. The indole side chain of tryptophan (Trp) has the most intense signal with a
maximum around 280 nm, and a weaker transition at 292 nm. Tyrosine (Tyr) has a
weaker absorbance, with its strongest transition occurring at 276 nm and weaker
transitions appearing as shoulders at 267 and 280 nm in the absorbance spectrum.
Phenylalanine (Phe) has an even weaker absorbance which occurs as a triplet at
250-270 nm due to its vibrational states. Disulfide bonds absorb weakly from about
260 nm and below, and the His imidazole ring has a weak n to π* transition which
can also contribute to the intensity of the protein absorbance in this part of the spectrum.
Ribonucleic acid (RNA) and deoxyribonucleic acid (DNA) both absorb light
between 240 and 275 nm, with the maximum around 260 nm originating from the π
to π* transition of the pyrimidine and purine ring systems.

Table 3.1 Electronic transitions of amino acids that contribute to absorbance spectra [8, 9]

Chromophore            Transitions    Wavelengths (nm)
Peptide backbone       π to π*        190-195
                       n to π*        210-220
Tryptophan (Trp)       π to π*        280
                                      292
Tyrosine (Tyr)         π to π*        276
                                      Shoulders at 267, 280
Phenylalanine (Phe)    π to π*        Triplet, 250-270
Cysteine (Cys)         π to π*        <260
Histidine (His)        n to π*        <260
Nucleic acids          π to π*        260 (240-275)

The  electron  transitions  resulting  from  interactions  between  these  chromophores 
and  light  of  specific  wavelengths  are  the  basis  for  all  the  spectroscopic  techniques 
discussed  in  this  chapter.  Both  the  absorbance  of  light  and  the  emitted  energy  as  the 
macromolecule  relaxes  back  to  its  energetic  ground  state  are  exploited  by  different 
technologies,  providing  information  on  the  conformation  of  proteins  and  nucleic 
acids  and  the  folding  and  unfolding  of  these  important  molecules. 


3.1.2    Theory  of  Light  Scattering 

Most of the events described above depend on the absorbance of light as the electrons
are excited and then return to their ground state. Another important, more
complex interaction between light waves and macromolecules results in light scattering.
Light scattering occurs when electrons are driven to oscillate at the frequency
of the electric field of the incident wave, creating an induced dipole moment, which
becomes a source of electromagnetic radiation emitted at the same frequency but at a
different angle than that of the incident light, or the light passing through the solution
without molecular interactions [10-12].

When the size of the scattering molecule is significantly smaller than the wavelength
of the incident light, Rayleigh scattering occurs. The scattering for a single,
small isotropic particle can be expressed as

$$\frac{I}{I_0} = \frac{8\pi^4\alpha^2}{r^2\lambda^4}\left(1 + \cos^2\theta\right) \tag{3.2}$$

where I is the intensity of scattered light, I₀ is the intensity of the original incident
light, α is the molecular polarizability, r is the distance between the point of observation
and the position of the scattering particle, λ is the wavelength of the incident
light, and θ is the angle between the incident beam and the direction of observation.
In reality, proteins are studied in solutions. For a dilute ideal solution containing
concentration c of particles of molecular weight M, scattering can be quantified by
the Rayleigh ratio (R_θ). The advantage of the Rayleigh ratio is that it is independent
of the incident light intensity and the distance to the scattered light detector (i.e.,
independent of I₀ and r).


$$R_\theta = KcM \tag{3.3}$$


where K is a constant

$$K = \frac{2\pi^2 n_0^2 \left(dn/dc\right)^2}{N\lambda_0^4}\left(1 + \cos^2\theta\right) \tag{3.4}$$

and  N  is  Avogadro's  number,  n0  is  the  refractive  index  of  the  solvent,  and  dn/dc  is  the 
refractive  index  increment  and  X0  is  the  wavelength  of  the  light  in  vacuo.  Note  that  if 
the  refractive  indexes  of  the  solvent  and  of  the  polymer  are  equivalent  then  dn/dc  will 
be  zero  and  there  will  be  no  polarizability  and  therefore  no  scattered  light.  The  con- 
stant K  depends  only  on  the  solvent  properties,  on  X,  and  on  0.  Kis  therefore  a  system 
constant  that  is  independent  of  the  concentration  of  the  solution  and  the  molecular 
weight  of  the  polymer.  M  is  the  weight-averaged  molecular  weight  of  the  solute  pres- 
ent; all  of  the  self-associated  states  present  contribute  to  this  based  on  their  fraction 
of  the  total  weight.  This  can  be  used  to  determine  the  molecular  weight  of  the  protein 
being  analyzed  as  a  function  of  any  solution  condition,  as  long  as  the  protein  concen- 
tration is  low  enough  to  prevent  intermolecular  interactions  between  the  individual 
molecules  that  would  affect  the  scattering  behavior. 
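
As a rough illustration of Eqs. 3.3 and 3.4, the following Python sketch computes K and
then solves the Rayleigh ratio for M. It assumes unpolarized light, so that K carries the
angular factor (1 + cos²θ), and CGS units throughout (dn/dc in mL/g, wavelength in cm,
concentration in g/mL, Rθ in cm⁻¹). The function names and all numerical inputs,
including the dn/dc value of 0.185 mL/g that is often quoted for proteins, are
illustrative assumptions and are not taken from this chapter.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def optical_constant(n0, dn_dc_ml_per_g, wavelength_cm, theta_deg):
    """Return K (mol cm^2 g^-2) for unpolarized incident light.

    Assumes K = 2 pi^2 n0^2 (dn/dc)^2 (1 + cos^2 theta) / (N lambda0^4).
    """
    angular = 1.0 + math.cos(math.radians(theta_deg)) ** 2
    numerator = 2.0 * math.pi ** 2 * n0 ** 2 * dn_dc_ml_per_g ** 2 * angular
    return numerator / (AVOGADRO * wavelength_cm ** 4)


def molecular_weight(rayleigh_ratio, conc_g_per_ml, k):
    """Solve R_theta = K c M for the weight-averaged M (g/mol)."""
    return rayleigh_ratio / (k * conc_g_per_ml)


# Hypothetical setup: water (n0 = 1.333), 633 nm light, detector at 90 degrees.
k = optical_constant(n0=1.333, dn_dc_ml_per_g=0.185,
                     wavelength_cm=633e-7, theta_deg=90.0)

# Simulate the Rayleigh ratio of a 1 mg/mL solution of a 150 kDa protein,
# then recover M from that simulated reading.
simulated_r = k * 1.0e-3 * 150_000.0
print(molecular_weight(simulated_r, 1.0e-3, k))
```

Because M here is a weight average, a sample containing even a small mass fraction of
large aggregates would return a value noticeably above the monomer molecular weight.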


3.2    Chapter  Overview 

In  this  chapter  we  will  discuss  absorbance,  circular  dichroism,  fluorescence,  Fourier 
transform  infrared  (FTIR),  Raman  spectroscopy,  and  light  scattering.  Each  section 
begins  with  a  brief  overview  of  supporting  theory  for  the  methodology,  followed  by 
examples of its application to the analysis of macromolecules, primarily under
equilibrium conditions.

3.2.1    Light Absorbance

3.2.1.1  Introduction 

Most biological macromolecules absorb light in the ultraviolet region (as
described in the introduction to this chapter); several good reviews of this area have
been published recently [13, 14], and it is the subject of several textbooks [1-6].

A  typical  absorbance  spectrum  is  taken  by  shining  light  of  varying  wavelengths 
through  a  solution  and  recording  the  intensity  of  the  energy  passing  through  the 
sample  (transmission)  or  the  amount  of  light  blocked  by  the  samples  (absorbance) 
as  a  function  of  wavelength.  An  example  of  the  absorbance  spectrum  of  a  protein  is 
shown  in  Fig.  3.2.  According  to  the  Lambert-Beer  law  (also  Beer's  law),  the  amount 
of  light  absorbed  is  directly  proportional  to  the  concentration  of  the  chromophores 
present: 


$$A = \log_{10}\left(\frac{I_0}{I}\right) = kl = \varepsilon c l \qquad (3.5)$$

where A is the measured absorbance, I0 is the original intensity of the light, I is the
intensity of the light transmitted through the sample, and l is the distance in cm
the light travels through the material (i.e., path length); k is the absorption
coefficient, which is determined for each molecule based on Beer's law; c is the molar
concentration of the chromophores, and ε (L/mol cm) is the absorbance of a 1 M
solution (molar absorptivity) of the chromophores in a 1 cm path, often referred to as
the extinction coefficient.

Fig. 3.2 Absorbance spectrum of anti-streptavidin (red) and second derivative spectrum (blue).
Provided by Feng He, Amgen, Inc.

However, the actual extinction coefficient as well as the wavelength of maximum
absorbance can be affected by the environment of the chromophores in the
protein, including interactions with the surrounding amino acid residues, the
secondary structure, and the degree of exposure to the buffer. The buffer solutions
commonly used for protein separation and characterization often have absorption that
overlaps with that of the peptide bond in the far UV region. Because of these
background interferences, absorbance at these wavelengths is often used for detection
of proteins in solutions, but not for other analyses. More sensitive analysis of the
secondary structure based on peptide bond configurations and interactions with light of
these wavelengths can be obtained with FTIR, circular dichroism, and Raman
spectroscopies.

The absorbance spectra of proteins between 250 and 300 nm are the result of a
combination of the absorbance spectra of the aromatic amino acid chromophores
and disulfide bonds, as described in the introduction. The wavelength maximum of
the spectrum depends on the specific amino acid composition and the local
environment of individual chromophores, which is dependent on the tertiary structure of
the protein [8, 15]. Modifications of any of these residues, including oxidation and
ionization, can affect the absorbance spectra of these chromophores, and can also
change the local environment in that region of the protein. In addition to the intrinsic
chromophores of proteins, cofactors can also contribute to the absorbance of light
and often have characteristic spectra in the visible range of the electromagnetic spectrum.

Nucleic acids also absorb in the UV region [8, 15]; the intensity of absorbance is
affected by the protonation states of the aromatic rings and by the hydrophobic
environment created by the stacking of the bases. Single-stranded nucleic acids
absorb more strongly than double-stranded molecules (the hyperchromic effect). This
feature can be used to monitor the "melting" or loss of structure of the nucleic acids.

The  dependence  of  absorbance  on  path  length  allows  this  type  of  measurement 
to  be  applied  across  a  broad  concentration  range  for  all  chromophores  by  the  use  of 
cuvettes  with  the  appropriate  path  length,  or  the  recently  developed  variable  path 
length  spectrometers.  As  long  as  the  absorbance  is  greater  than  the  buffer  blank  and 
the  signal  to  noise  ratio  is  acceptable,  the  path  length  can  be  increased  to  allow 
measurement  of  low  concentration  solutions,  or  decreased  to  allow  analysis  of  very 
high  concentration  solutions. 

3.2.1.2  Applications 
Detection 

UV  absorbance  can  be  used  as  a  sensitive  detection  method  for  many  macromolecules. 
For  proteins  it  is  commonly  used  for  detecting  which  fractions  contain  protein 
during  column  chromatography,  and  it  is  also  used  to  detect  the  presence  of  nucleic 
acids.  The  wavelength  used  can  depend  on  the  sensitivity  and  specificity  needed  for 
the specific analysis. Detection in the far UV region is the most sensitive because of
the large number of peptide bond chromophores in the molecule, but it also suffers the
most interference from buffer components. Monitoring of proteins at 280 nm is used
if there is a sufficient amount of protein present to be detected. Though the
absorbance of proteins is lower at 280 than at 214 nm, there is significantly less
interference from buffer components, allowing for better differentiation of the protein
signal from the baseline. Interference from particles, including protein aggregates, as
well as DNA or RNA if they are present, can confound the quantitation, and a calibration
curve is often used here as well. Absorbance in the UV will give relative amounts of
protein, for instance in different fractions eluting from a column.

Concentration  Determination 

The absorbance of light is one of the oldest, and yet still most reliable, methods for
determining the concentration of chromophores. The accuracy of the measurement
depends on the accuracy of the extinction coefficient determination. For proteins the
theoretical extinction coefficient at 280 nm in water can be determined based on
the molar absorptivity (ε) of the aromatic amino acids and disulfide bonds present
in the molecule, as described by Pace et al. [16]. The extinction coefficient at this
wavelength for the free amino acid in water is 5,540 M⁻¹ cm⁻¹ for tryptophan, 1,480 for
tyrosine, 125 for the disulfide bonds, and 10 for phenylalanine. The theoretical molar
extinction coefficient of a protein is derived by summing the contributions of all of
these species in the protein; dividing the result by the molecular weight of the entire
protein gives the absorbance expected for a 1 mg/mL solution. This is usually very close
to the actual extinction coefficient of the protein in the aqueous buffers most commonly
used for protein characterization. Once the molar extinction coefficient of a protein
has been determined, the concentration of any sample can be measured using the
Lambert-Beer law as described above (Eq. 3.5). This method allows not only quantitation
of individual protein samples but also the comparison of properties such as enzyme
activity or binding affinity across different protein species or solutions, since they
can be normalized for protein concentration.
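
The calculation described above can be sketched in a few lines of Python. The
per-residue molar absorptivities are the values quoted in the text; the residue counts,
molecular weight, and absorbance reading belong to a made-up protein and are assumptions
used only to show the arithmetic.

```python
# Molar absorptivities at 280 nm (M^-1 cm^-1), as quoted in the text above.
EPSILON_280 = {"Trp": 5540.0, "Tyr": 1480.0, "disulfide": 125.0, "Phe": 10.0}


def molar_extinction(counts):
    """Sum the contribution of each chromophore type (M^-1 cm^-1)."""
    return sum(EPSILON_280[name] * n for name, n in counts.items())


def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Lambert-Beer law, A = epsilon * c * l, solved for c (mol/L)."""
    return absorbance / (epsilon * path_cm)


# Hypothetical protein: 2 Trp, 5 Tyr, 1 disulfide bond, 3 Phe, 20 kDa.
counts = {"Trp": 2, "Tyr": 5, "disulfide": 1, "Phe": 3}
molecular_weight_da = 20000.0

eps = molar_extinction(counts)           # molar extinction coefficient
a_1mg_ml = eps / molecular_weight_da     # absorbance of a 1 mg/mL solution, 1 cm path
conc = concentration_molar(0.50, eps)    # sample reading A280 = 0.50 in a 1 cm cuvette

print(eps, a_1mg_ml, conc)
```

Passing a shorter or longer `path_cm` to the same function covers the variable path
length measurements mentioned earlier for very concentrated or very dilute samples.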

The  absorbance  spectra  of  proteins  and  nucleic  acids  overlap  between  250  and 
280  nm,  with  proteins  having  a  much  weaker  molar  absorptivity  (smaller  extinction 
coefficient).  If  a  sample  contains  both  protein  and  nucleic  acids  it  is  necessary  to 
correct  for  the  DNA  contribution  before  calculating  the  protein  concentration.  The 
reverse is not true because of the much smaller protein contribution at 260 nm
relative to that of the nucleic acids. The Warburg-Christian equation uses the ratio of
the absorbance at 260 to that at 280 nm (A260/A280) to correct for the contribution of
nucleic acid to the protein absorbance spectrum in this wavelength range. In water
this ratio is 0.57 for pure protein solutions, about 1.8 for pure DNA, and about 2 for
solutions of pure RNA [14].
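
A minimal sketch of this kind of correction is given below. The chapter does not state
the numerical form of the equation; the coefficients used here (1.55 and 0.76, for a
1 cm path length and a result in mg/mL) are the commonly quoted approximate form and
should be treated as an assumption, as should the example readings.

```python
def a260_a280_ratio(a260, a280):
    """Ratio used to judge nucleic acid contamination of a protein sample."""
    return a260 / a280


def protein_mg_per_ml(a260, a280):
    """Approximate protein concentration (mg/mL) for a 1 cm path length.

    Uses the commonly quoted form: 1.55 * A280 - 0.76 * A260 (assumed here).
    """
    return 1.55 * a280 - 0.76 * a260


# Hypothetical readings from a protein sample containing some nucleic acid.
a260, a280 = 0.60, 0.80
ratio = a260_a280_ratio(a260, a280)
protein = protein_mg_per_ml(a260, a280)
print(ratio, protein)
```

A ratio well above the value expected for pure protein signals that the uncorrected
A280 reading would overestimate the protein concentration.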

The  determination  of  concentration  from  the  absorbance  of  a  protein  at  280  nm 
can  be  complicated  by  light  scattering  from  larger  protein  aggregates  or  particles. 
As  described  previously,  the  wavelength  maximum  for  light  scattering  is  dependent 
on the size of the species responsible for the scattering, extending well into the
visible region; however, there can also be contributions as low as 280 nm from this
phenomenon. Consequently light scattering causes an apparent increase in the
absorbance of the solution, because scattered light does not reach the detector, and
can result in overestimating the concentration of the protein. Removal of the
scattering species prior to analysis should be attempted whenever possible. When light
scattering is known to occur and the species responsible cannot be removed,
approximate corrections can sometimes be made [7].

Structural  Determination 

As  shown  in  Fig.  3.2,  the  typical  UV  absorbance  spectrum  of  a  protein  is  a  broad 
peak  that  is  made  up  of  the  contributions  of  all  of  the  Trp  and  Tyr  residues  as  well 
as  the  disulfide  bonds  and  Phe  in  a  molecule.  It  is  possible  to  differentiate  these 
individual  contributions  with  a  combination  of  sensitive  detectors  and  sophisticated 
lineshape  data  analysis  [7,  17-19].  The  underlying  contributions  of  the  individual 
aromatic  amino  acids  can  often  be  differentiated  by  taking  the  second  or  fourth 
derivative  of  the  absorption  spectrum  (Fig.  3.2).  Changes  in  the  environment  of  the 
chromophores  can  be  detected  with  this  type  of  spectroscopy  [20,  21].  While  not  as 
sensitive  as  the  near-UV  circular  dichroism  (CD)  or  fluorescence  spectroscopies 
which  are  described  in  the  following  sections,  the  relative  ease  of  acquiring  this  type 
of  data  makes  it  an  important  component  in  the  toolbox  of  biophysical  techniques 
for  analysis  of  macromolecules,  especially  proteins. 
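
To illustrate why derivative spectra help, the sketch below builds a synthetic spectrum
from two overlapping Gaussian bands and differentiates it numerically with NumPy. The
band positions, widths, and heights are invented for the example and do not represent
any particular protein; real data are noisy, and since differentiation amplifies noise,
a smoothing step would normally be applied first.

```python
import numpy as np


def gaussian_band(x, center, width, height):
    """Gaussian absorption band (illustrative line shape)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


def strict_local_minima(y):
    """Boolean mask of interior points lower than both neighbors."""
    mask = np.zeros(y.shape, dtype=bool)
    mask[1:-1] = (y[1:-1] < y[:-2]) & (y[1:-1] < y[2:])
    return mask


wavelength = np.arange(250.0, 310.5, 0.5)  # nm
# Two hypothetical overlapping bands; the weaker one shows up only as a shoulder.
spectrum = (gaussian_band(wavelength, 278.0, 4.0, 1.0)
            + gaussian_band(wavelength, 288.0, 4.0, 0.5))

step = wavelength[1] - wavelength[0]
second_derivative = np.gradient(np.gradient(spectrum, step), step)

# Count maxima in the raw spectrum and pronounced minima in its second derivative.
n_raw_maxima = int(strict_local_minima(-spectrum).sum())
pronounced = second_derivative < 0.2 * second_derivative.min()
band_positions = wavelength[strict_local_minima(second_derivative) & pronounced]
print(n_raw_maxima, band_positions)
```

In this synthetic case the raw spectrum has a single maximum, while the second
derivative shows a separate minimum near each underlying band, which is the behavior
exploited in Fig. 3.2.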

Other  Chromophores 

The presence of important biological cofactors can also be exploited with light
spectroscopy, in this case often in the visible range [21, 22]. Absorbance of the
metal-containing chromophores is sensitive to the oxidation state of the metal as
well as its local environment within the protein, and therefore these characteristics
can be used to probe both the ionization state of the metal and the binding affinity
to the protein [21, 22]. For instance, for the cytochrome family and other
heme-containing molecules, absorbance gives rise to the Soret band, an intense peak in
the blue wavelength region of the visible spectrum that is sensitive to the spin state
of the electrons and the oxidation state of the metal. Thus visible spectra have been
employed to follow electron flow during the catalytic cycle of cytochrome P-450s
and the other cytochromes important for energy metabolism, as well as other
metalloproteins. Changes in the spin state or absorbance of the metal can be used
to study the kinetics of ligand binding. The interaction of the flavin, retinoid, and
other π-conjugated cofactors with visible light is often necessary for the biological
function of the protein as well, for example in the initial steps of photosynthesis or
in the functioning of the cones and rods in our eyes. The cofactor nicotinamide adenine
dinucleotide (NAD) shows a substantial increase in absorbance at 340 nm when
converted to NADH, an important reaction in energy metabolism. This change in
absorbance intensity has been used extensively in enzyme kinetic studies of systems
that utilize this cofactor.

Fig. 3.3 The relation of ellipticity to the differential absorption of circularly polarized radiation.
(a) Plane-polarized radiation is made up of left- and right-handed circularly polarized components;
(b) interaction of the radiation with a chiral chromophore leads to unequal absorption
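
The change in absorbance at 340 nm that accompanies conversion of NAD to NADH,
described above, can be turned into a reaction rate with the Lambert-Beer law (Eq. 3.5).
The sketch below fits a straight line to absorbance readings and divides the slope by
the molar absorptivity of NADH. The value 6,220 M⁻¹ cm⁻¹ is the commonly quoted figure
and is not given in this chapter, so it is an assumption here, as are the readings.

```python
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1, commonly quoted value (assumed)


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = sum((xi - mean_x) ** 2 for xi in x)
    return numerator / denominator


def nadh_rate_molar_per_min(times_min, a340, path_cm=1.0):
    """Rate of NADH formation (mol/L per minute) from the initial A340 slope."""
    return linear_slope(times_min, a340) / (NADH_EPSILON_340 * path_cm)


# Hypothetical readings taken every minute in a 1 cm cuvette.
times_min = [0.0, 1.0, 2.0, 3.0]
a340 = [0.100, 0.162, 0.224, 0.286]
rate = nadh_rate_molar_per_min(times_min, a340)
print(rate)
```

Only the initial, linear part of a progress curve should be used in this way; once the
trace curves, the slope no longer reflects the initial rate.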

3.2.1.3    Concluding  Comments  on  UV  Absorbance 

UV  absorbance  is  perhaps  the  simplest  interaction  of  macromolecules  with  light. 
It can be used for detection of proteins and DNA, and for determination of the
concentration of these species. Changes in the conformation of proteins can be followed
by taking the derivatives of the absorbance spectra. It can also be a very useful tool
for studying systems with cofactors such as the heme group or conjugated π systems
like the flavins. While it is a relatively low resolution technique, its wide availability
and ease of use make it perhaps the most frequently used of the light spectroscopy
methodologies.


3.2.2    Circular  Dichroism 

3.2.2.1  Theory 

As described above, the local environment can affect the interactions of
chromophores with light, and this can be used to analyze the structure of macromolecules.
Circular  dichroism  (CD),  the  differential  absorption  of  the  left-  and  right-circularly 
polarized  components  of  plane-polarized  electromagnetic  radiation  (Fig.  3.3),  is 
frequently  used  to  monitor  structural  changes. 

A CD signal arises when a chromophore is chiral (optically active); a CD
spectrum is obtained when circular dichroism is measured as a function of wavelength.
Since this methodology is based on absorbance, the chromophores in proteins are
the peptide bonds, aromatic amino acids, disulfide bonds and cofactors discussed
in the introduction. Exposure of these chromophores to a chiral field introduces
perturbations that increase the optical activity of the molecule. In the far UV region
(240-190 nm), the peptide bond absorption can be used to assess the content of
regular secondary structural features such as α-helix and β-sheet. The CD spectrum
in the near-UV region (320-260 nm) reflects the environments of the aromatic
amino acid side chains and can be used to assess changes in the tertiary structure of
the protein. Other nonprotein chromophores such as flavin and heme moieties can
give CD signals in the range of 300-700 nm which depend on the environment of
the chromophore. DNA in its helical form can also have a CD signal; in this case the
signal maximum is around 260 nm.

CD is one of the most sensitive techniques for determining structures and
monitoring structural changes of biomolecules, though like all spectroscopic methods it
can only determine the average of a molecular population, and cannot provide the
high-resolution structural data available from X-ray crystallography or nuclear
magnetic resonance (NMR). Because of its convenience and applicability under a
wide variety of experimental conditions, CD can be used for many applications,
including exploring protein-ligand interactions, assessing conformational changes
and studying protein folding. For proteins and peptides, CD data are usually reported
in units of mean residue ellipticity (degrees × centimeters squared per decimole).

$$[\theta] = \frac{M_r \, \theta_{obs}}{10 \, d \, C} \qquad (3.6)$$

[θ]: Mean residue ellipticity (degree cm²/decimole)

Mr: Mean residue weight

C: Protein concentration (mg/mL)

d: Path length of the cell (cm)

θobs: Observed ellipticity (in millidegrees when C is expressed in mg/mL)
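
The conversion to mean residue ellipticity can be sketched as follows. The sketch
assumes the usual form of the conversion, with the observed ellipticity in millidegrees,
the concentration in mg/mL, the path length in cm, and a factor of 10 in the
denominator; it also assumes the common convention that the mean residue weight is the
molecular weight divided by the number of peptide bonds. The numerical inputs are
hypothetical.

```python
def mean_residue_weight(molecular_weight, n_residues):
    """Mean residue weight, taken here as M / (N - 1) (number of peptide bonds)."""
    return molecular_weight / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg, mrw, path_cm, conc_mg_ml):
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    theta_mdeg: observed ellipticity in millidegrees
    mrw:        mean residue weight (g/mol per residue)
    path_cm:    cell path length in cm
    conc_mg_ml: protein concentration in mg/mL
    """
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


# Hypothetical far-UV reading: -20 mdeg in a 0.1 cm cell at 0.2 mg/mL.
mrw = mean_residue_weight(molecular_weight=16500.0, n_residues=151)
mre = mean_residue_ellipticity(theta_mdeg=-20.0, mrw=mrw,
                               path_cm=0.1, conc_mg_ml=0.2)
print(mrw, mre)
```

Normalizing in this way lets spectra recorded at different concentrations or in cells
of different path length be overlaid directly.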


3.2.2.2  Applications 
Secondary  Structure 

In the far UV region (250 nm and below) the CD signal from proteins is the result
of multiple electronic transitions arising from energy transfer to all of the peptide
bonds. The difference in the hydrogen bonding states of different forms of
secondary structure in proteins (e.g., α-helix, β-sheet, turn and disordered) results in
characteristic CD signals (Fig. 3.4) that can provide information about the amounts and
different forms of regular secondary structure [23-32] in the protein under the
solution conditions being analyzed. For example, α-helices have a strong and
characteristic CD spectrum with an intense positive band at 192 nm and two negative bands
at 208 and 222 nm; β-sheets display a fairly intense positive band at 198 nm and a
negative band at 215 nm; turns display a positive band at 205 nm, a very weak
negative band at 230 nm and an intense negative band at 195 nm; and completely
unfolded proteins (random coils) display an intense negative band at 198 nm. In
addition, aromatic residues can occasionally contribute to this part of the spectrum;
Tyr and Phe can have positive bands near 220 nm, and Trp can also have a negative
band at 214 nm. However, in helical proteins these contributions are often masked
by the strong negative ellipticity of the α-helix. Many additives such as salts and
surfactants (excipients) commonly used in protein formulations have absorbance in
the region below 200 nm. The increase in the total absorbance of the solution in the
presence of excipients leads to a decrease in the signal to noise ratio, and greater
experimental variability in this region, making it difficult to detect small differences
in the CD signal. When possible, spectra should be collected down to 170 nm or
below, but in the aqueous buffers most commonly used for DNA and protein
solutions the lower limit is usually closer to 200 nm.

Fig. 3.4 Far UV CD spectra associated with various types of secondary structure. Solid
line α-helix, long dashed line antiparallel β-sheet, dotted line type I β-turn, cross
dashed line extended 3₁-helix or poly(Pro) II helix, short dashed line irregular
structure. The data are adapted from Kelly et al. [23]
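
Because each type of secondary structure has a characteristic far UV spectrum, a
measured spectrum can be approximated as a weighted sum of reference spectra, with the
weights estimating the fraction of each structure type. The sketch below shows only
this basic idea, using ordinary least squares in NumPy. The reference shapes are
invented Gaussian combinations whose band positions loosely follow the text; their
widths and amplitudes are made up, and the "measured" spectrum is simulated, so none of
the numbers are real data. Published analysis programs are considerably more elaborate
than this.

```python
import numpy as np

wavelength = np.arange(190.0, 251.0, 1.0)  # nm


def band(center, width, amplitude):
    """Gaussian band on the shared wavelength grid (illustrative shape only)."""
    return amplitude * np.exp(-0.5 * ((wavelength - center) / width) ** 2)


# Hypothetical reference spectra, one column per structure type.
basis = np.column_stack([
    band(192, 5, 70) + band(208, 5, -33) + band(222, 6, -36),  # helix-like
    band(198, 5, 30) + band(215, 7, -18),                      # sheet-like
    band(198, 5, -40),                                         # coil-like
])

# Simulate a noise-free "measured" spectrum from known fractions.
true_fractions = np.array([0.5, 0.3, 0.2])
measured = basis @ true_fractions

# Solve basis @ fractions ~= measured in the least-squares sense, then normalize.
fractions, *_ = np.linalg.lstsq(basis, measured, rcond=None)
fractions = fractions / fractions.sum()
print(fractions)
```

With noise-free simulated input the fit simply returns the fractions that were put in;
with real spectra the quality of the estimate depends heavily on the reference set and
on how far into the far UV the data extend.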

There are a number of algorithms that use data from far UV CD spectra to
provide an estimation of the secondary structure composition of proteins. Most of these
procedures are based on datasets composed of CD spectra of proteins whose
structures have been solved by X-ray crystallography and span the different types of
secondary structures [24, 25]. Widely used algorithms include SELCON
(self-consistent) [26], VARSLC (variable selection) [27], CDSSTR [28], K2d [29] and
CONTIN [30]. An online server, DICHROWEB [31, 32], has been developed by the
laboratory of Dr. Bonnie Wallace at the University of London; it allows data to
be entered and analyzed by various algorithms with a choice of databases to
estimate the secondary structure content and the structural family of a protein.

Fig. 3.5 Near-UV CD spectra of five different proteins

Tertiary Structure

Proteins also have a CD signal in the 240-340 nm range because of the absorbance
of light by the aromatic amino acid side chains and the disulfide bonds in the
near-UV region of the spectrum. The intensity and sign (negative or positive) of the CD
signal of these residues depend on the asymmetric environment in which they are
located, which is determined by the three-dimensional structure of the protein, thus
providing a fingerprint representative of the tertiary structure of the folded protein
[33-36]. Each of the aromatic amino acids has a characteristic CD wavelength
profile that corresponds to its absorbance spectrum. Trp has a peak close to 290 nm
with fine structure between 285 and 305 nm; Tyr has a peak between 270 and
285 nm, with a maximum at 278 nm and a shoulder at longer wavelengths often
obscured by the Trp band; Phe shows a triplet between 250 and 265 nm with the
strongest peak at 257 nm. Disulfide bonds also absorb in the near-UV region (weak broad
absorption bands from 250 to 280 nm); changes in the dihedral angle of the
disulfide bond will result in a change in the intensity of the signal in this region of
the spectrum. The actual shape and magnitude of the near-UV CD spectrum of a
protein will depend on the protein primary sequence, the number of each type of
aromatic amino acids present, their mobility, and the nature of their environment
(H-bonding, polarity, and polarizability). Figure 3.5 shows an overlay of near-UV
CD spectra of five different proteins. For proteins, the absence of regular structure
results in a symmetric environment, and zero CD intensity, while an ordered
structure and the resulting local asymmetric environment of the aromatic amino
acid residues and disulfide bonds result in a characteristic spectrum. Unlike far-UV
CD spectroscopy (170-240 nm), features in the near-UV CD (240-340 nm) spectrum
cannot be assigned to any particular three-dimensional structure. Rather, near-UV
CD spectra provide information on the nature of the chromophores in the proteins,
the interactions of these residues with other amino acids in close proximity, and
their interactions with the solvent. This methodology can be used to study unfolding
and folding of proteins under specific solution conditions as changes in the spectra
reflect changes in the local environment of the chromophores and the degree of
solvent exposure, resulting in changes in the "fingerprint."

Fig. 3.6 Far UV CD spectra of the E. coli-derived Fc molecules as a function of pH
(panel title: E. coli Fc; horizontal axis: wavelength, nm)

Conformational  Changes  in  Proteins 

CD spectra are dependent on the protein conformation, and therefore can be used to
monitor changes in the protein structure that are induced by temperature, mutations,
pH, denaturants, binding interactions, etc., to follow the kinetics of protein
unfolding and to determine the amount and type of secondary structure. As an example,
Figs. 3.6 and 3.7 show the far- and near-UV CD spectra, respectively, of the E.
coli-derived Fc domain as a function of pH (the Fc is the crystallizable fragment of an
IgG that contains the CH2 and CH3 constant domains). The data demonstrate that
as the pH decreases from pH 7 to pH 2 the far-UV CD spectrum of the Fc domain loses
the band at 218 nm, which is characteristic of β-sheet structure. At the same time the
intensity of the near-UV CD signal decreases towards the baseline signal of unfolded
protein. The far- and near-UV CD data suggest that as the pH decreases from pH 7
to pH 2, the secondary and tertiary structures of the Fc domain unfold significantly.
These data can be used to generate unfolding rates, and identify solution conditions
under which the protein unfolds, but cannot provide information on the specific
region or sequence of the protein involved in the changes being monitored. Mutations
or labeling of the protein can be used to provide more information about the primary
sequence involved, but this technique can never provide structural information on an
atomic level like X-ray crystallography or NMR. Assessing stability to solution
conditions is very important as a screening tool during the development of protein
therapeutics, and is used to help identify candidates that can survive manufacturing,
storage, and delivery conditions.

Fig. 3.7 Near-UV CD spectra of the E. coli-derived Fc molecules as a function of pH
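
One simple way to summarize data of this kind is to convert the CD signal at a single
wavelength into an apparent fraction of unfolded protein. The sketch below assumes a
two-state transition, in which the observed signal is a population-weighted average of
a native and an unfolded baseline; this is a simplifying assumption that real proteins
do not always satisfy. The ellipticity values are hypothetical and are not read from
Figs. 3.6 or 3.7.

```python
def fraction_unfolded(signal, native_signal, unfolded_signal):
    """Apparent fraction unfolded for a two-state transition.

    Assumes signal = (1 - f) * native_signal + f * unfolded_signal.
    """
    return (signal - native_signal) / (unfolded_signal - native_signal)


# Hypothetical mean residue ellipticities at 218 nm (deg cm^2 dmol^-1).
native = -8000.0      # fully folded reference, e.g. neutral pH
unfolded = -2000.0    # fully unfolded reference, e.g. low pH
readings = {7.0: -8000.0, 4.0: -6500.0, 3.0: -4000.0, 2.0: -2000.0}

for ph, signal in readings.items():
    print(ph, fraction_unfolded(signal, native, unfolded))
```

Plotting the resulting fraction against pH, temperature, or denaturant concentration
gives a transition curve that can be compared between candidate molecules.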

DNA  Conformation  and  Conformational  Changes 

DNA also can adopt different conformations, including the B-family of structures,
the A-form, the Z-form, guanine quadruplexes, cytosine quadruplexes, triplexes,
and other less-characterized structures. The different forms of DNA give rise to
different CD signals [37]. Spectra of the B-DNA forms contain a positive band between
260 and 280 nm and a negative band around 245 nm; the A-DNA form contains a
dominant positive band at 260 nm and a negative band at 210 nm; in the Z-DNA
form, the base pairs have an orientation with respect to the backbone opposite to that
in the B- and the A-forms, resulting in a spectrum with a negative band at about 290 nm,
a positive band around 260 nm and an increasingly negative signal at 205 nm
(Fig. 3.8). Therefore CD can be used to study the conformational transitions
between different states of DNA, generating a titration curve of changes in DNA
structure as a function of various solution conditions. By following the CD titration
curve generated, it is possible to distinguish between gradual changes within a single
DNA conformation and cooperative isomerization between discrete structural states.
This methodology can also be used to follow the kinetics of the appearance of
particular conformers and to determine their thermodynamic parameters. CD has
also been used to study DNA and RNA conformations in a receptor-mediated DNA and
RNA delivery complex [38].

Fig. 3.8 UV CD spectra of A- and B-form DNA (left) and Z-form DNA (right)


Quantitative Comparison of CD Spectra

Quantitative comparison of CD spectra remains a challenge due to their broad
spectral features and the inclusion of positive and negative signals. One method
commonly used by researchers in the field is deconvolution of the CD spectra of a
protein or DNA to determine contributions of individual structural elements,
followed by determination of the percentage of each component in the protein
secondary structure or each DNA form. Comparability is then determined by comparing
changes in the percentage of each component [28, 39-42]. The accuracy and
effectiveness of the deconvolution method are limited by the overlapping CD signals
of the amino acid side chains and conformational features of the different
secondary structure elements in a protein, as well as the lack of defined conformational
CD features for protein tertiary structure, and the lack of defined characteristic CD
features for the various forms of DNA. Therefore, deconvolution of CD spectra is
not routinely carried out for protein comparability assessments in the
biopharmaceutical industry. In recent publications, several groups have used the QC
Compare function of the Thermo Electron OMNIC software [43, 44] to compare entire CD
spectra in both the near- and far-UV regions for proteins, in order to systematically
qualify a CD method based on its precision and sensitivity [45-47].

Fig. 3.9 Comparison of visible CD spectra of Cu1PrP-(91-115) and Cu1PrP-(90-126).
Reproduced with permission, from Klewpatinond et al., 2007, Biochem. J., 404,
393-402. © The Biochemical Society [48]


Protein  Binding 

Visible CD spectroscopy is a powerful technique for studying metal-protein interactions
and can resolve individual d-d electronic transitions as separate bands. A protein or
polypeptide in solution shows a CD spectrum in the visible region of the electromagnetic
spectrum when a metal ion binds to it and the bound metal sits in a chiral environment
[48]. This can be used to study protein-metal binding, including the pH dependence and
stoichiometries of the interactions. As an example, Fig. 3.9 compares the visible CD
spectra of 1 mol equivalent of Cu2+ ions bound at pH 7.8 to two fragments of the prion
protein, PrP-(91-115) and the longer fragment PrP-(90-126) [49]. The significant
difference between the two spectra reflects differences in the relative affinity of the
two Cu2+ binding sites. Addition of 1 mol equivalent of Cu2+ to PrP-(91-115) at pH 7.8
results in ~70 % of the Cu2+ binding to His 111 and ~30 % to His 96. In contrast, when
the longer fragment PrP-(90-126) is used, ~95 % of the Cu2+ binds to His 111 and only
5 % to His 96 under the same conditions, resulting in a dramatically different visible
CD spectrum.

Aromatic side chains are frequently found in ligand-binding sites and are often present
in regions affected by conformational changes upon binding. When this is the case, the
near-UV CD spectra of proteins can also be used to probe ligand binding and the
resulting changes in protein conformation [50, 51].

3.2.2.3 Emerging CD Technology
Synchrotron  Radiation  Circular  Dichroism 

Synchrotron radiation circular dichroism (SRCD) spectroscopy is a variant of the CD
technique that uses the intense light from a synchrotron source to enable the collection
of data at much shorter wavelengths than is possible with a conventional CD instrument
[52, 53]. For conventional CD instruments with a xenon (Xe) light source, the intensity
of the radiation decreases dramatically below 190-180 nm. The N2 used for purging the
sample compartment, the optics, and the H2O present in conventional solvents all absorb
significantly in this region; this combination of factors makes it almost impossible to
collect data below 180 nm. However, estimation of secondary structure content is more
reliable and accurate if CD data at 170 nm and below are included, and SRCD makes such
data accessible. One caveat is that most of the buffers commonly used for protein
therapeutics, and for routine protein analysis, contain components that absorb below
190 nm and therefore cannot be used in these experiments.

Vibrational  Circular  Dichroism 

Vibrational circular dichroism (VCD) extends circular dichroism spectroscopy into the
infrared and near-infrared ranges (4,000-750 cm⁻¹) [54-59], monitoring molecular
vibrational and rotational transitions. Proteins with different secondary structures
have characteristic VCD spectra, which vary most for the amide I mode (C=O stretch) but
are also detectable for the broader amide II and (very weak) amide III modes (N-H
deformation and C-N stretch). This technique complements FTIR, described later in this
chapter, but at present is used primarily by specialists.

Concluding  Comments  on  CD 

CD spectroscopy is a powerful method for studying the conformational properties of
proteins and DNA. It can be used to determine the secondary and tertiary structure of
proteins and the different types of DNA helices in solution because it can detect, and
quantify the proportions of, the different conformations of these macromolecules.
Because CD spectroscopy is sensitive and relatively inexpensive, it has seen widespread
use in the life sciences, and is often used by the biopharmaceutical industry as a
characterization tool to study the effect on protein conformation of manufacturing
processes, formulation compositions and conditions, storage conditions, and delivery
systems. Like most spectroscopic techniques, it measures the average solution property
of the whole population present in a sample. For example, it is impossible to determine
whether a 10 % decrease in signal corresponds to 10 % of the molecular population
suffering a complete loss of signal while the remaining 90 % retains the native
structure, or whether the entire population has undergone a 10 % change in signal. CD
is also unable to determine the structure of a biomolecule at the atomic level, and
absorption flattening artifacts are common when analyzing membrane protein solutions
[60]. However, with recent developments in the quantification of CD spectroscopy, more
objective numerical methods are being used to compare CD spectra over the entire
spectral range. By following a titration with CD spectra, or the time course of changes
in the CD signal, one can study the binding of proteins or DNA to particular ligands, or
the kinetics of unfolding/refolding of these molecules. Thus CD is a very useful
technique for higher order structure determination. The applications of CD spectroscopy
described here focused on the analysis of systems at equilibrium; even when generating a
denaturation curve, the system is at equilibrium at each point. Stopped-flow instruments
and laser excitation methods are being developed to follow protein dynamics and changes
in the molecule on the nanosecond timescale, but are outside the scope of this chapter.
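
The ambiguity in interpreting a 10 % signal loss, noted above, follows from the fact that
the measured signal is a population-weighted average. As a minimal worked illustration
(the symbols f_i, theta_i and theta_N are introduced here for convenience and do not
appear in the original text), two quite different populations give the same observed
signal:

```latex
% theta_obs: measured CD signal; f_i: fraction of molecules in state i;
% theta_i: signal of state i; theta_N: signal of the native state.
\theta_{obs} = \sum_i f_i\,\theta_i

% Scenario A: 10 % of the molecules lose all signal, 90 % remain native
\theta_{obs}^{A} = 0.9\,\theta_N + 0.1 \times 0 = 0.9\,\theta_N

% Scenario B: every molecule retains 90 % of the native signal
\theta_{obs}^{B} = 1.0 \times (0.9\,\theta_N) = 0.9\,\theta_N
```

Both scenarios yield 0.9 times the native signal, so an orthogonal method is needed to
distinguish them.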


3.2.3  Fluorescence 

3.2.3.1  Theory 

Fluorescence spectroscopy has made important contributions to the study of
macromolecules, especially proteins, and has become an invaluable tool in the development
of the field of biotechnology. This method is more sensitive than light absorption, but
has more limited applications. The most common way to visualize the physical phenomenon
of fluorescence is via the Jablonski diagram shown in Fig. 3.1 [7, 61]. Fluorescence can
be described by the equations shown below.

$$\text{Excitation:}\quad S_0 + h\nu_{ex} \rightarrow S_1 \tag{3.7}$$

$$\text{Fluorescence (emission):}\quad S_1 \rightarrow S_0 + h\nu_{em} + \text{heat} \tag{3.8}$$

where $h$ is the Planck constant, $\nu$ is the frequency of light, and $S_0$ and $S_1$ are
the ground state and excited state of the fluorophore, respectively.

Fluorescence results from the emission of light of a lower energy than that absorbed by
the molecule following excitation, due to loss of energy by non-radiative processes. This
can occur because absorption is much faster than the relaxation of vibrational states
(10⁻¹⁵ s vs. 10⁻¹² s); in other words, the electronic transition from one energy level
to another occurs over a much shorter time than the atomic motions of vibration,
rotation, etc. No angular momentum is transferred during the electronic transition, and
the move to a different vibrational state only occurs if the vibrational states align.
Thus the molecule in the excited state has the same vibrational states as in the ground
state. This is the Franck-Condon principle, which states that the likelihood of a
transition from one vibrational state to another occurring during the transition between
electronic states is dependent on the overlap between the two vibrational wavefunctions.
The subsequent relaxation of the molecule to the lowest vibrational energy level of the
excited state without emitting any light is internal conversion.

Once the molecule is at the lowest vibrational energy level of an excited state, there
are both non-radiative and radiative processes by which it can release the remainder of
the energy and return to the ground state. Fluorescence occurs when the molecule emits a
photon during its return to the ground state. In accord with Kasha's rule, emission of
light, in this case fluorescence, can only occur from the lowest excited state. The
fluorescence "lifetime", the approximate time the molecule spends in the lowest
vibrational energy level of the excited state, is on the order of 10⁻⁸ s for organic
molecules. Because of internal conversion, the energy of the emitted photon is lower than
that of the incident photon used to excite the molecule; the corresponding shift of the
emitted light to a lower frequency (longer wavelength) is called the Stokes shift.
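
The size of a Stokes shift is easy to put into numbers. The following minimal sketch
assumes excitation and emission maxima of 295 nm and 353 nm (the values quoted for
tryptophan in water elsewhere in this chapter); the function names are illustrative
choices, not part of any instrument software.

```python
def stokes_shift_wavenumber(excitation_nm: float, emission_nm: float) -> float:
    """Return the Stokes shift in cm^-1 for maxima given in nm."""
    if excitation_nm <= 0 or emission_nm <= 0:
        raise ValueError("wavelengths must be positive")
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm)
    return 1.0e7 / excitation_nm - 1.0e7 / emission_nm


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy E = h*c/lambda, expressed in electronvolts."""
    h = 6.62607015e-34     # Planck constant, J s
    c = 2.99792458e8       # speed of light, m/s
    ev = 1.602176634e-19   # joules per electronvolt
    return h * c / (wavelength_nm * 1e-9) / ev


# Assumed example values: Trp excitation and emission maxima in water
shift = stokes_shift_wavenumber(295.0, 353.0)
energy_lost = photon_energy_ev(295.0) - photon_energy_ev(353.0)
print(f"Stokes shift: {shift:.0f} cm^-1")
print(f"Energy dissipated per photon: {energy_lost:.2f} eV")
```

Expressing the shift in wavenumbers (or energy) rather than nanometres makes shifts
measured at different excitation wavelengths directly comparable.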

Another way a molecule can dissipate energy and return to the ground state is through
Förster (fluorescence) resonance energy transfer (FRET) to another fluorophore in
physical proximity through dipole-dipole coupling. Energy from the excited state of one
fluorophore (the "donor") can be transferred to the excited state of the other
fluorophore (the "acceptor"). For this to occur there needs to be significant overlap
between the emission spectrum of the donor and the absorption spectrum of the acceptor.
The donor does not actually emit a photon that the acceptor then absorbs; rather, a
non-radiative excited state energy transfer occurs because of interactions (resonance)
between the dipoles of the two fluorophores. The efficiency of the energy transfer
depends on the distance between the two molecules involved, the spectral overlap of the
emission spectrum of the donor molecule and the absorption spectrum of the acceptor
molecule, and the relative orientation of the dipole moments of the two molecules [61].
The Förster distance, defined as the distance at which 50 % FRET occurs, is in the range
of 20-60 Å for most biological macromolecules. The FRET efficiency is given by the
following equation:

$$E = \frac{k_{ET}}{k_f + k_{ET} + \sum k_i} \tag{3.9}$$

where the FRET efficiency ($E$) is the quantum yield of the energy transfer transition,
$k_{ET}$ is the rate of energy transfer, $k_f$ is the radiative decay rate, and $k_i$ are
the rate constants of other de-excitation pathways.

$E$ depends on the donor-to-acceptor separation distance $r$ with an inverse sixth power
law because of the dipole-dipole coupling mechanism:

$$E = \frac{1}{1 + (r/R_0)^6} \tag{3.10}$$

with $R_0$ being the Förster distance for the donor and acceptor pair at which the energy
transfer efficiency is 50 %. The Förster distance is dependent on the overlap of the
emission spectrum of the donor and the absorption spectrum of the acceptor molecule, and
on the relative orientation of the dipoles of the two molecules. This is expressed as

$$R_0^6 = \frac{9\,Q_0\,(\ln 10)\,\kappa^2\,J}{128\,\pi^5\,\eta^4\,N_A} \tag{3.11}$$

where $Q_0$ is the fluorescence quantum yield of the donor in the absence of the acceptor
molecule, $\kappa^2$ is the dipole orientation factor, $J$ is the spectral overlap factor,
$\eta$ is the refractive index of the solution, and $N_A$ is Avogadro's number.
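
The inverse sixth-power dependence in Eq. 3.10 is what makes FRET so sensitive to
distance near the Förster distance. The sketch below evaluates Eq. 3.10 for a few
separations; the Förster distance of 50 Å is a hypothetical value chosen from within the
20-60 Å range quoted above, not a property of any particular dye pair.

```python
def fret_efficiency(r: float, r0: float) -> float:
    """Eq. 3.10: E = 1 / (1 + (r / R0)^6), with r and R0 in the same units."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and R0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


R0_ANGSTROM = 50.0  # hypothetical Förster distance, for illustration only

for r in (25.0, 40.0, 50.0, 60.0, 75.0):
    e = fret_efficiency(r, R0_ANGSTROM)
    print(f"r = {r:5.1f} A  ->  E = {e:.3f}")
```

With these assumed numbers the efficiency falls from about 0.98 at half the Förster
distance to about 0.08 at 1.5 times the Förster distance, which is why FRET is most
informative for separations close to R0.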

Time-resolved fluorescence spectroscopy is a tool used for monitoring the kinetics of
protein folding that exploits intrinsic fluorescence. For proteins containing multiple
Trps, this approach can be used to monitor the folding of the protein around each
individual Trp (through the formation of the correct local environment). In this
technique the sample is excited with a pulse of light whose duration is much shorter than
the fluorescence lifetime of the sample, resulting in a population of fluorophores in the
excited state. Over time, with fluorescence emission, these excited state molecules
return to the ground state. The time-dependent emission intensity is measured, and in
most cases the intensity decays exponentially over time. Fitting the exponential decay
yields the lifetime of the molecule ($\tau$) in the excited state. This is shown by the
following equation.

$$F(t) = F_0\, e^{-t/\tau} \tag{3.12}$$

where $F(t)$ is the emission intensity at time $t$, $\tau$ is the fluorescence lifetime,
and $F_0$ is the intensity at time zero.
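
As a minimal sketch of how a lifetime can be extracted from a single-exponential decay,
the example below fits a straight line to the logarithm of the intensity, whose slope is
-1/tau. The data are synthetic and noise-free, generated with an assumed lifetime of
3.1 ns; real measurements are normally analyzed by nonlinear fitting that also accounts
for the instrument response, which this sketch does not attempt.

```python
import math


def fit_single_exponential(times, intensities):
    """Fit F(t) = F0 * exp(-t / tau) by linear least squares on ln F(t).

    Returns (F0, tau). All intensities must be positive.
    """
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two (time, intensity) pairs")
    logs = [math.log(f) for f in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx                    # slope of ln F versus t is -1/tau
    intercept = mean_y - slope * mean_t  # intercept is ln F0
    return math.exp(intercept), -1.0 / slope


# Synthetic decay with assumed parameters (illustration only)
true_f0, true_tau = 1000.0, 3.1
times_ns = [0.5 * i for i in range(20)]
decay = [true_f0 * math.exp(-t / true_tau) for t in times_ns]

f0_fit, tau_fit = fit_single_exponential(times_ns, decay)
print(f"F0 = {f0_fit:.1f}, tau = {tau_fit:.2f} ns")
```

A log-linear fit is only appropriate when one exponential describes the decay; curvature
in a plot of ln F against t is a simple indication that more than one lifetime is present.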

3.2.3.2  Applications 
Intrinsic  Fluorescence 

The sources of intrinsic fluorescence in proteins are the aromatic amino acids. In most
cases, the indole group of Trp dominates protein fluorescence, with relatively weaker
contributions from Tyr and Phe. Within proteins, FRET can occur from Tyr to Trp due to
overlap between the emission spectrum of Tyr and the absorbance (or excitation) spectrum
of Trp, if the distance and orientation between the two amino acids in the folded
protein allow this to occur. Thus Trp emission is the most frequently observed
fluorescence signal in proteins.

The emission of Trp in particular is highly dependent on its local environment, making
it a valuable probe for monitoring protein conformational changes. The energy released
by the transition from the LUMO to the HOMO is dependent on the interactions between the
molecular dipole and the solvent or surrounding environment in the protein, similar to
the effect they have on the wavelength of light absorbed in absorbance spectroscopy.
Adjusting the excitation wavelength can ensure that Trp fluorescence is monitored with
minimal contribution from Tyr and Phe. (See Table 3.2 for fluorescence parameters of
aromatic amino acids in water.) When excited at 295 nm, >95 % of protein intrinsic
fluorescence emission is from Trp, and the emission maximum of free or exposed
tryptophan in water at pH 7 is about 353 nm. Generally the tryptophan emission in
proteins is significantly blue-shifted (towards shorter wavelengths) because of the lower
polarity of the local environments created by the peptide backbone and the fold of the
protein, which protects the fluorophore from the solvent. The amount of solvent exposure
of the Trp depends on the hydrophobicity of its local environment in the native protein;
as a protein unfolds the solvent exposure typically increases and the wavelength of
maximum emission approaches that of fully solvent-exposed Trp (353 nm). This sensitivity
to the local environment makes intrinsic fluorescence a potent probe for the degree of
folding/unfolding in a protein sample.

Table 3.2 Fluorescence attributes of aromatic amino acids in proteins, when present in
water at pH 7.0 (adapted from ref. [61])

Amino acid      Excitation wavelength   Emission wavelength   Quantum yield   Lifetime (ns)
                max (nm)                max (nm)
Phenylalanine   260                     282                   0.02            6.8
Tyrosine        275                     304                   0.14            3.6
Tryptophan      295                     353                   0.13            3.1

Protein  Conformation 

The wavelength of maximum fluorescence and the intensity provide information on the
environment of the fluorophores and thus the conformation of the protein. The azurins
provide an excellent illustration of the effect of environment on Trp fluorescence. The
azurin Pae from Pseudomonas aeruginosa contains a single tryptophan at position 48,
which is in a highly hydrophobic environment in the protein core formed by an
eight-stranded beta barrel, resulting in an emission maximum at 308 nm [62]. Figure 3.10
illustrates the red and blue shifts of the spectra that can occur as a result of the
different Trp environments present in different proteins. The figure compares the
intrinsic fluorescence spectra of a growth factor, a protein that contains one buried
tryptophan with a maximum at about 310 nm, and an Fc-fusion protein (the IgG1 Fc fused
to a receptor domain) that contains over 27 Trps in more solvent-exposed environments,
resulting in a wavelength maximum at ~340 nm. The width of the fluorescence emission
also reflects the number of Trps and the heterogeneity of the environments in which they
are located in the folded protein, with broader spectra corresponding to proteins with
multiple Trps located in environments with differing polarizability.

Red or blue shifts of the intrinsic fluorescence maximum of a protein can indicate
conformational changes such as unfolding and/or aggregation. Upon denaturation of a
protein by 6 M guanidine hydrochloride (GdnHCl), the Trp fluorescence shifts to 351 nm,
close to the value observed for the individual amino acid in water.

Fig. 3.10 Intrinsic fluorescence spectra of growth factor protein (blue) and Fc-fusion
protein (red) showing significant differences in their wavelength at maximum
fluorescence [85]

The intensity of fluorescence of Trp and Tyr residues is also dependent on the polarity
of the solvent environment and nearby residues. The interior hydrophobic environment of
proteins inhibits non-radiative processes and promotes high fluorescence quantum yields
and therefore higher fluorescence intensity, while exposure to a polar solvent
environment in unfolded proteins results in a lower quantum yield and decreased
fluorescence intensity. In this context, Latypov et al. have shown that the emission
spectrum of IL-1Ra undergoes a 50 % loss in intensity when denatured by urea [63].
However, in certain instances unfolding increases the fluorescence of the protein; this
occurs if the Trp fluorescence is quenched by nearby amino acids in the folded state, in
particular nearby disulfide groups, amides and protonated His [64].

The intensity of intrinsic fluorescence of a protein can also be affected by protein
concentration and temperature. The linear relationship between protein concentration and
fluorescence intensity is limited to very low protein concentrations. As protein
concentration increases, the inner filter effect arises from the optical density of the
protein itself, which decreases the fluorescence intensity compared to that in an
infinitely dilute solution. The intrinsic fluorescence intensity of a protein also
decreases with increasing temperature before the onset of an unfolding event, due to the
quenching effect brought about by higher temperature.
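
When the inner filter effect is modest, it is often compensated numerically. The sketch
below uses a widely quoted approximate correction, F_corr = F_obs x 10^((A_ex + A_em)/2),
which is not given in this chapter and is included here only as an assumption; it
presumes a standard cuvette geometry and measured absorbances at the excitation and
emission wavelengths. The numbers in the example are hypothetical.

```python
def inner_filter_correction(f_observed: float, a_ex: float, a_em: float) -> float:
    """Approximate inner-filter correction (assumed formula, not from the source).

    f_observed: measured fluorescence intensity
    a_ex, a_em: sample absorbance at the excitation and emission wavelengths
    """
    if a_ex < 0 or a_em < 0:
        raise ValueError("absorbances must be non-negative")
    return f_observed * 10.0 ** ((a_ex + a_em) / 2.0)


# Hypothetical reading: 100 intensity units, A(ex) = 0.10, A(em) = 0.02
corrected = inner_filter_correction(100.0, 0.10, 0.02)
print(f"Corrected intensity: {corrected:.1f}")
```

The correction grows quickly with absorbance, so diluting the sample until the intensity
is again linear in concentration remains the more robust remedy.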

Conformational changes and kinetics of folding have been extensively investigated for
human granulocyte colony stimulating factor (G-CSF) [65-67]. In this case not only does
the environment of the Trp residues change with unfolding, but the efficiency of energy
transfer from nearby Tyr residues decreases as well. Decreasing the pH from 6 to 2.5
caused a decrease in Trp fluorescence, with a concurrent increase in the intensity of Tyr
fluorescence [8]. Native G-CSF has two Trp residues, at positions 58 and 118. Variants
in which the individual tryptophans were mutated by site-directed mutagenesis showed
that Trp 58 was solvent accessible, with an emission maximum of 350 nm that was readily
quenched by GdnHCl, while Trp 118 had an emission maximum of 344 nm and required a
higher concentration of GdnHCl for quenching [66]. As the protein unfolds, changes in
the tertiary and secondary structure also increase the distance between a Tyr and Trp
118, decreasing the efficiency of energy transfer. This results in a decrease in the Trp
fluorescence and an accompanying increase in the fluorescence of the Tyr.

The fluorescence lifetime can also provide information on the protein fold and the
environment surrounding the Trp. In the case of G-CSF, the kinetics of GdnHCl-induced
unfolding and refolding have been studied using time-resolved fluorescence spectroscopy
[67]. At GdnHCl concentrations beyond the equilibrium denaturation transition zone, the
unfolding kinetics of G-CSF are adequately fit by a single time constant, while the
refolding kinetics required two time constants, one in the 20-100 ms range and another in
the 200-1,000 ms range. Interestingly, this was not due to the presence of the two Trps
in the wild-type molecule, as mutagenesis experiments showed that the single
Trp-containing constructs exhibited the same refolding behavior. The data in fact
supported the generation of an intermediate during the refolding process rather than a
cooperative two-state reaction. Additional experiments showed that when the starting
GdnHCl concentration was below the equilibrium denaturation transition, the refolding
kinetics were best represented by two intermediates in the process.

Fluorescence  Quenching 

Fluorescence quenching refers to the decrease in fluorescence intensity of a fluorophore
that results from the presence of a quencher such as molecular oxygen or acrylamide.
Fluorescence quenching can result from multiple pathways, including dynamic (collisional)
and static quenching. Collisional quenching is described by the Stern-Volmer equation:

$$\frac{F_0}{F} = 1 + K_D[Q] \tag{3.13}$$

where $F_0$ and $F$ are the fluorescence intensities in the absence and presence of
quencher, $K_D$ is the Stern-Volmer constant, and $[Q]$ is the concentration of quencher.
For biophysical applications, quenching experiments are mostly used to probe the solvent
accessibility of Trp in the folded protein and during protein folding. The larger the
$K_D$, the more accessible the fluorophores are to the quencher, indicating greater
exposure of these residues to the external solvent. Examples of such applications can be
found in multiple publications [68, 69].
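
In practice the Stern-Volmer constant is obtained from the slope of a plot of F0/F
against quencher concentration. The sketch below does this by least squares with the
intercept fixed at 1, as Eq. 3.13 requires; the quencher concentrations, the intensities
and the value of 10 M^-1 used to generate them are hypothetical, and the function name is
an illustrative choice.

```python
def stern_volmer_constant(quencher_molar, intensities, f0):
    """Least-squares slope of (F0/F - 1) versus [Q], with the line forced through zero.

    quencher_molar: quencher concentrations [Q] in mol/L
    intensities:    fluorescence intensity F measured at each [Q]
    f0:             intensity in the absence of quencher
    """
    if len(quencher_molar) != len(intensities):
        raise ValueError("need one intensity per quencher concentration")
    numerator = 0.0
    denominator = 0.0
    for q, f in zip(quencher_molar, intensities):
        y = f0 / f - 1.0          # Eq. 3.13 rearranged: F0/F - 1 = K_D [Q]
        numerator += q * y
        denominator += q * q
    if denominator == 0.0:
        raise ValueError("at least one non-zero quencher concentration is required")
    return numerator / denominator


# Synthetic titration generated with an assumed K_D of 10 M^-1
f0 = 100.0
k_true = 10.0
acrylamide = [0.05, 0.10, 0.20, 0.40]
measured = [f0 / (1.0 + k_true * q) for q in acrylamide]

k_fit = stern_volmer_constant(acrylamide, measured, f0)
print(f"K_D = {k_fit:.2f} M^-1")
```

Upward curvature in a real Stern-Volmer plot would suggest that static quenching
contributes as well, in which case this single-constant treatment no longer applies.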

Extrinsic  Fluorescence 

In addition to the intrinsic fluorophores (the aromatic residues present in proteins),
extrinsic fluorophores can expand the application of fluorescence as a tool to study a
variety of biochemical phenomena. These fluorophores may be either covalently attached
to proteins or bound to proteins in a noncovalent fashion. The amine, sulfhydryl, or His
amino acid side chains provide convenient reactive groups for labeling proteins in a
covalent manner. Commonly used fluorophores in this category include dansyl chloride
(DNS-Cl) and fluorescein isothiocyanate (FITC) [61]. Attachment of a fluorophore like
DNS-Cl is especially useful as this probe absorbs at 350 nm and emits in the 520 nm
range and thus does not interfere with the intrinsic fluorescence of proteins. DNS-Cl
can be a good probe for fluorescence polarization experiments, which measure the
rotational diffusion of a protein, because of its short fluorescence lifetime in the
unbound state (~10 ns). Fluorescence polarization or anisotropy occurs because
fluorophores absorb light polarized along a particular direction, or vector. The extent
to which the fluorophore rotates between excitation and emission affects the anisotropy,
and any process that changes the rotational diffusion of the protein, such as
protein-protein or protein-membrane interactions, can be studied by following changes in
the polarization of the fluorophore. Some examples of this approach using extrinsic
fluorescence to study protein-protein association and membrane microenvironments can be
found in [70, 71].

The other class of extrinsic fluorophores comprises those that bind noncovalently to
proteins. One of the most frequently used members of this class is
1-anilinonaphthalene-8-sulfonic acid (ANS). ANS is only weakly fluorescent in water, but
fluoresces strongly when bound to protein surfaces. In one of the earliest experiments
using this approach, ANS was bound to bovine serum albumin [72]. Interactions between
protein and the ANS family of dyes typically increase as a protein unfolds, so increased
dye fluorescence in the presence of protein is an indirect measure of protein unfolding.
This can be used as a probe for protein folding under different solution conditions and
stresses. Because of this, ANS has become a useful tool for comparing relative stability
to pH, temperature, etc., during selection of protein therapeutic candidates for optimal
stability [73]. In an example of this type of study, either two or four copies of a
non-Fc moiety were attached to the Fc domain of an antibody, creating a series of "2x" or
"4x" Fc-constructs. In the case of the 4x Fc-constructs, it was found that one had more
than a twofold increase in ANS binding compared to the other two, indicating greater
exposure of the protein core and thus decreased conformational stability. This, along
with other biophysical data, was used to eliminate this construct from further
development. There are two schools of thought regarding the mechanism of action of ANS.
One theory is that, due to the amphipathic nature of the dye, ANS associates with the
nonpolar regions of the macromolecule [61]. However, a recent report studying the binding
of ANS to human interleukin-1 receptor antagonist suggests, using two-dimensional NMR
data, that ANS was bound in this case to a solvent-exposed positively charged region of
the protein. Other related dyes with stronger signals have more recently been developed
and applied to protein characterization, including in a high-throughput configuration
[73]. In this application a 96-well plate is heated while fluorescence is measured, with
samples in the wells differing either in protein or buffer content. The temperature at
which unfolding begins correlates with the point where fluorescence of the probe
increases, allowing a quick assessment of relative stability between the samples tested.

Red Edge Excitation of Fluorescence

Another fluorescence method that is used to study protein and membrane dynamics is red
edge excitation. A red edge excitation shift of fluorescence is observed when the
fluorophores are in an environment that is not fully fluid, such that fluorescence
emission becomes dependent on the excitation wavelength. Through the use of different
excitation wavelengths, fluorophores in different environments can be monitored to
investigate the stability and dynamics of proteins and membranes. Examples of this
approach can be found in references from Lakowicz, Chattopadhyay, and Thakkar
[69, 74, 75].

Förster (Fluorescence) Resonance Energy Transfer

FRET can be used to determine the proximity of fluorophores intrinsic to proteins and
other molecules. Alternatively, molecules can be labeled specifically with fluorophore
donor/acceptor pairs and the efficiency of transfer then used to determine the distance
between the labeled sites in a folded protein, changes in conformation induced by
stress, protein unfolding, etc.

Several donor/acceptor pairs, such as CFP (cyan fluorescent protein; donor; 447 nm) and
YFP (yellow fluorescent protein; acceptor; 514 nm), can be used to probe the distances
between the sites where they are attached within a molecule. In general, the assumption
is that the transition dipoles of the donor and acceptor are randomly oriented; in
certain constrained environments, however, if they are perpendicular to each other, FRET
does not occur. An illustrative example of the application of FRET to biological systems
can be found in Shih et al. [76]. These authors used FRET to understand the
conformational changes that occur in myosin because of ATP hydrolysis. In order to do
so, they replaced all the native Cys in Dictyostelium myosin-II, and then reintroduced
Cys in specific locations on the protein surface, based on the crystal structure. The
only Cys that they did not mutate was Cys 655, a highly conserved residue essential for
function, which the crystal structure indicates is completely buried and therefore not
accessible to either donor or acceptor dyes during the labeling reaction. The
reintroduced Cys were then labeled with donor (Oregon green 488 maleimide) and acceptor
(tetramethylrhodamine-5-maleimide) fluorescent dyes. These dyes served as reporters of
conformational change based on the efficiency of the FRET, which indirectly probed
changes in the distance separating them. Indeed, they observed that the efficiency of
the FRET was in the range of 11-16 % in the absence of nucleotide (donor fluorophore at
position 114 or 116, acceptor fluorophore at position 250), while in the presence of
Mg-ATP the efficiency of the FRET increased to 32-36 %, and in the presence of Mg-ADP it
increased even further, to 57-61 %. This work supported a long-standing theory that
there was a 70° conformational change in the light chain region of myosin because of ATP
binding. Variants of green fluorescent protein (GFP), such as CFP and YFP, can be used
as FRET probes within living cells. These have been employed in biological systems to
understand interactions inside the cell [77, 78], and how the proximity of different
proteins changes under different conditions, with the CFP as the donor and the YFP as
the acceptor fluorophore.
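
The FRET efficiencies reported above for myosin can be turned into relative distances by
inverting the inverse sixth-power relation between efficiency and separation, giving
r/R0 = (1/E - 1)^(1/6). The sketch below does only that. The Förster distance of the dye
pair is not stated in the text, so distances are expressed as multiples of R0, and the
calculation assumes that R0 (and therefore the dipole orientation factor) is the same in
all three nucleotide states, which is an assumption made here for illustration.

```python
def relative_distance(efficiency: float) -> float:
    """Invert E = 1 / (1 + (r/R0)^6) to obtain r/R0 = (1/E - 1)^(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Efficiency ranges quoted in the text for the labeled myosin
reported = {
    "no nucleotide": (0.11, 0.16),
    "Mg-ATP": (0.32, 0.36),
    "Mg-ADP": (0.57, 0.61),
}

for state, (e_low, e_high) in reported.items():
    # A higher efficiency corresponds to a shorter donor-acceptor distance
    print(f"{state}: r/R0 between {relative_distance(e_high):.2f} "
          f"and {relative_distance(e_low):.2f}")
```

Because of the sixth-root dependence, a roughly fivefold change in efficiency corresponds
to a much smaller fractional change in distance.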

Fluorescence  Microscopy 

Fluorescence microscopy is another technique that utilizes extrinsic fluorescence
labeling, in this case in conjunction with an optical microscope, which is used to
monitor fluorescent emission following the illumination of a sample with light
corresponding to its excitation wavelength. This light is absorbed by the fluorophores
(fluorescent tag) and causes them to emit light at a longer, lower energy wavelength,
allowing localization of the fluorescent tag within the sample or structure being
studied. This fluorescent light can be separated from the surrounding radiation with
filters designed for that specific wavelength, so that only the fluorescent molecules
are seen. The excitation filters ensure that light of the appropriate wavelengths from
the source (e.g., a xenon arc lamp) illuminates the sample, while an emission filter
ensures that light at the fluorescence emission wavelengths is captured while excluding
light at other wavelengths. Fluorescence microscopy has been successfully employed in
cell biology to image protein distributions and cellular compartments within a cell. For
example, 4',6-diamidino-2-phenylindole (DAPI) is a dye that selectively binds to A-T
rich regions in DNA, and therefore readily stains nuclei within cells.

Immunofluorescence is another technique used to selectively stain proteins [79,
80]. Direct immunofluorescence uses a single antibody conjugated to a fluorophore
to stain the target protein of interest inside cells. Indirect immunofluorescence uses
two antibodies; the first (primary) IgG is specific to the target, while the second is
targeted to the constant region of the first (anti-IgG). Each approach has advantages and
disadvantages: direct immunofluorescence is more specific and involves fewer steps in
the staining procedure, but the number of targets with available primary antibodies
directly conjugated to a dye is limited. The indirect immunofluorescent method is
more widely applicable because the labeled secondary antibody recognizes the
constant region of a whole class of primary antibodies. For example, "rabbit anti-goat"
denotes a fluorescently labeled secondary antibody raised in rabbit that recognizes all
antibodies with a goat Fc domain, and can therefore be used to visualize the location
of any goat IgG. Applications of this technique are thus limited only by the ability
to generate a primary antibody to the target protein of interest. Immunofluorescence
has been utilized in several studies to stain intracellular distributions of proteins,
such as inclusion bodies [80], and to follow the regulation of proteins associated with
the cell cycle [81].

In  addition  to  monitoring  fluorescence  of  proteins,  newer  approaches  have  been 
developed  to  monitor  protein-protein  interactions  in  living  cells.  For  example, 
Kim  et  al.  have  shown  that  FRET,  bioluminescence  resonance  energy  transfer 
(BRET), and fluorescence lifetime imaging microscopy (FLIM) can be adapted for
microscopy-based  visualization  of  protein-protein  interactions  [82].  In  this  case, 
they  showed  that  all  three  methods  detected  interactions  between  amyloid  precursor 
protein  (APP)  and  two  proteins  called  APBB1  and  APBB2. 

Solid-State  Fluorescence 

Until recently, fluorescence was limited primarily to the analysis of solutions.
However, studying proteins in the solid state is useful for protein therapeutics that are
formulated as solids, using processes such as lyophilization and spray drying,
to circumvent instabilities of the molecule that occur in solution. In these
instances, the secondary and tertiary structures of the protein still need to be
monitored to determine if the protein is stable during storage in the solid state. While
FT-IR is readily available to assess secondary structure ([83]; see the FTIR section later
in this chapter), there are significant limitations in utilizing available techniques, such as
X-ray diffraction and NMR, to monitor changes in tertiary structure that could
affect protein folding and function [84].

Fluorescence spectroscopy offers a convenient approach to monitor changes in
the tertiary structure in the solid state. The main limitation of this technique in the
past has been interference from scattering of light in the same spectral region as the
emission spectrum of the solid. Sharma and Kalonia [84] showed that a front-face
sample holder could be used to obtain intrinsic fluorescence (Trp) spectra in these
samples. Their data indicated that the changes in λmax of different proteins could be
correlated with the inherent differences in the tertiary structure. However, they
encountered difficulty in quantitatively correlating any changes observed in
fluorescence intensity to changes in tertiary structure of the samples.

Recently, Ramachander et al. [85] overcame these difficulties using a Cary
Eclipse solid-state holder. In this case, N-acetyl tryptophan amide (NATA) was used
to demonstrate reproducibility in measurements, as well as to provide a reference
for a "fully-exposed" Trp environment. A significant blue shift (~21 nm) was seen
between NATA in the solid state (334 nm) and the reconstituted NATA from the
lyophilized formulation (355 nm), indicating a decrease in polarity in the solid-state
environment. The Trp emission of one of the proteins tested in the solid state was
blue shifted even further (to ~314 nm). Figure 3.11 illustrates the significant blue shift
for the same protein in the solid state vs. in solution. However, for a given protein,
these authors observed that fluorescence intensity, and not wavelength, was a better
indicator of stability under accelerated stability conditions. The loss of fluorescence
intensity over time at 60 °C was well correlated (R² = 89 %) with the appearance of
the tetramer peak for reconstituted samples as analyzed by size exclusion
chromatography (SEC), suggesting that the solid-state fluorescence measurements
were indeed picking up changes in protein conformation related to aggregation in
the solid state (Fig. 3.12).
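
As an illustration of how such a correlation can be quantified, the sketch below computes R² as the square of the Pearson correlation coefficient between fluorescence intensity and percent tetramer. The numbers are hypothetical and chosen only to show the calculation; they are not the data of [85], and the cited study may have used a different regression procedure.

```python
def r_squared(x, y):
    """Square of the Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values for illustration only (not data from [85]).
percent_tetramer = [0.5, 1.2, 2.0, 3.1, 4.5]
fluorescence_intensity = [100.0, 93.0, 88.0, 76.0, 67.0]

print(f"R^2 = {r_squared(percent_tetramer, fluorescence_intensity):.3f}")
```
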

Fig. 3.11 Comparison of intrinsic fluorescence of proteins X, Y, and Z in the solid state
(lyophilized) and in solution, in their formulation buffers [85]


Use  of  Fluorescence  Detection  in  High  Performance 
Liquid  Chromatography 

Ultraviolet detection has been the standard for protein high performance liquid
chromatography (HPLC). However, sometimes the analyte that is being followed
is present in such low abundance that it is below the limit of detection
of UV absorbance. In the case of proteins, concentrations in the μg/mL
range cannot be easily detected by UV absorbance, and in this instance fluorescence
detection, usually by tryptophan fluorescence, is the method of choice. In other
examples, fluorescence has been employed to improve selectivity, such as the
detection and quantification of the total concentration of free homocysteine in human
plasma by Araki and Sako [86]. Another example demonstrating the novel use of
fluorescence is the detection of adducts to protein and to DNA, whose levels are
typically fairly low in biological samples [87].

Other  Applications 

In  this  section  we  have  focused  primarily  on  applications  of  fluorescence  to  study 
systems  or  samples  in  equilibrium.  This  field  is  constantly  evolving  and  there 
are  other  emerging  technologies  which  have  great  promise  for  the  future.  The  use 
of  fluorescence  for  single  molecule  analysis  is  included  in  Chap.  10.  An  example  of 
fluorescence  correlation  spectroscopy,  another  rapidly  expanding  application 
of  fluorescence,  is  included  in  the  chapter  on  molecular  machines.  The  application  of 
fluorescence  to  the  life  sciences  is  a  dynamic,  rapidly  evolving  field. 

3.2.3.3    Concluding  Comments  on  the  Use  of  Fluorescence 

Fluorescence is a very powerful technique in the life sciences for studying
macromolecules and their interactions. Intrinsic fluorescence can be used to monitor
protein conformation, stability, and folding/unfolding kinetics. The development of
specific fluorescent dyes that interact with macromolecules expanded the application
of fluorescence beyond those proteins where Trp and Tyr were fortuitously located in
regions of interest, to all molecules that can be labeled. Incorporation of fluorescent
labels has made it possible to interrogate interactions between specific regions of the
same protein, or between different proteins and macromolecules that can be labeled.
Fluorescence microscopy can be used to visualize the interactions and localization of
these molecules at the level of the cell and organelles. For the applications discussed
in this section, fluorescence spectroscopy provides information on the average
environment of the ensemble of tertiary structures present in the sample being analyzed;
the technique can provide even more valuable information about protein structure in
conjunction with other spectroscopic techniques such as CD, FTIR, and NMR
spectroscopies. The evolution of the technology continues to expand to other conditions
and even to the level of individual molecules. When applicable, this technique is a
powerful biophysical tool for scientists in the life sciences.

Fig. 3.12 Formation of high-molecular-weight species (HMWS) observed by size exclusion
chromatography. (a) SEC profile following incubation of the lyophilized samples at 60 °C.
The SEC technique separates proteins in solution based on their hydrodynamic volume. The
larger aggregates and tetramers pass through the column faster and have an earlier retention
time than the native protein Y dimer and lower-molecular-weight species. Excipients and
buffer salts in the sample are the last peaks to be observed in the chromatogram. (b) Plot of
intrinsic fluorescence intensities of protein Y at all time points in the solid state vs. %
tetramer observed in SEC. (c) Plot of intrinsic fluorescence intensities of protein Y at all time
points in the solid state vs. % main peak observed in SEC [85]

Fig. 3.13 Examples of vibrational modes of the peptide backbone (bending and stretching)


3.2.4  FTIR 

3.2.4.1  Theory 

Infrared (IR) spectroscopy measures the wavelength and intensity of the absorption
of infrared light by a sample due to transitions in the vibrational state (i.e., stretching
or bending of a chemical bond, Fig. 3.13) of a molecule, rather than the transitions
of electrons to different energy levels as is the case for UV absorbance, UV CD, and
fluorescence spectroscopies described in the previous sections of this chapter. The
vibrational frequency and intensity of an IR band depend on the strength of the specific
chemical bond, on the masses of the atoms involved, and on the change in the dipole moment
of the bond during the vibration. For proteins, the amide bond of the peptide backbone
exhibits characteristic bands in different regions of an FTIR spectrum, designated the
amide A, B, I, II, III, IV, V, VI, and VII bands (Fig. 3.14). The amide I, II, and III bands are
the three major bands of a protein infrared spectrum. The amide I band (1,600-1,700 cm⁻¹) is
mainly associated with the C=O stretching vibration. Different secondary structural
features such as α-helix and β-sheet have characteristic band frequencies and
intensities in the amide I band region [88, 89]. The amide II band (1,500-1,600 cm⁻¹)
results from the combination of N-H bending and C-N stretching vibrations. This
band is also conformationally sensitive and has been used in conjunction with
hydrogen-deuterium exchange to study changes in protein flexibility and dynamics
[90-92]. Amide III bands (1,200-1,350 cm⁻¹) are predominantly the result of C-N
stretching and C-N-H in-plane bending modes, and are sensitive to different
secondary structures [93]. The combination of C=O and C-N stretching, C-N-H
bending, etc., results in an IR spectrum containing various amide bands that are
sensitive indicators of the conformation and dynamics of the macromolecule being
studied. FTIR spectroscopy can be used to study changes in the vibrational state of
molecules in solution, on a surface, or in the solid state [88-97]. Thus it is a very
versatile technique, one of the few that can be used for studying the molecular structure
of biological molecules regardless of the state of the sample and whether the sample
is colored or not.

Fig. 3.14 FTIR spectrum with the different amide regions labeled
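
The dependence of band position on bond strength and on the atoms involved can be made concrete with the simplest model of a vibrating bond, a two-atom harmonic oscillator, for which the wavenumber is (1/(2πc))·sqrt(k/μ), where k is the force constant and μ the reduced mass. The sketch below is an idealization, since real amide modes are coupled vibrations, and the force constants are assumed order-of-magnitude values rather than numbers taken from this chapter.

```python
import math

SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10
ATOMIC_MASS_UNIT_KG = 1.66053907e-27


def reduced_mass_kg(mass_a_u, mass_b_u):
    """Reduced mass of a two-atom oscillator, from atomic masses in unified mass units."""
    return (mass_a_u * mass_b_u) / (mass_a_u + mass_b_u) * ATOMIC_MASS_UNIT_KG


def stretch_wavenumber(force_constant_n_per_m, mass_a_u, mass_b_u):
    """Harmonic wavenumber in cm^-1: (1 / (2*pi*c)) * sqrt(k / mu)."""
    mu = reduced_mass_kg(mass_a_u, mass_b_u)
    angular_frequency = math.sqrt(force_constant_n_per_m / mu)
    return angular_frequency / (2.0 * math.pi * SPEED_OF_LIGHT_CM_PER_S)


# Assumed, order-of-magnitude force constant for a carbonyl bond (not from the text).
K_CARBONYL = 1200.0
carbonyl = stretch_wavenumber(K_CARBONYL, 12.000, 15.995)
print(f"C=O stretch: {carbonyl:.0f} cm^-1")

# Same bond strength, heavier atom: replacing H by D lowers the stretch frequency.
K_NH = 650.0  # assumed value
nh = stretch_wavenumber(K_NH, 14.003, 1.008)
nd = stretch_wavenumber(K_NH, 14.003, 2.014)
print(f"N-H / N-D frequency ratio: {nh / nd:.3f}")
```

The second calculation shows why isotopic substitution shifts IR bands even though the chemical bond is unchanged: only the reduced mass differs.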

The Fourier transform (FT) version of IR spectroscopy uses a continuum
light source (such as a Globar) to produce light over a broad range of infrared
wavelengths, and a Michelson interferometer to record all of these wavelengths
simultaneously, which allows much faster generation of an IR spectrum (intensity vs.
frequency) [90]. The light signal is measured at many discrete positions of the moving
mirror to generate an interferogram, which is then converted to an IR spectrum by
Fourier transformation.
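
The conversion from interferogram to spectrum can be sketched with a plain discrete Fourier transform. The example below builds a synthetic interferogram from two cosine components standing in for two IR bands and recovers their wavenumbers. The band positions, intensities, and sampling parameters are assumptions made for illustration; practical FTIR processing also involves apodization, phase correction, and zero filling, which are omitted here.

```python
import cmath
import math


def spectrum_from_interferogram(interferogram, step_cm):
    """Discrete Fourier transform of an interferogram sampled every step_cm of path difference.

    Returns (wavenumbers in cm^-1, magnitudes) for the bins below the Nyquist limit.
    """
    n = len(interferogram)
    wavenumbers, magnitudes = [], []
    for k in range(n // 2):
        total = 0j
        for m, value in enumerate(interferogram):
            total += value * cmath.exp(-2j * math.pi * k * m / n)
        wavenumbers.append(k / (n * step_cm))
        magnitudes.append(abs(total) / n)
    return wavenumbers, magnitudes


# Synthetic interferogram: two cosine components standing in for two IR bands.
N_POINTS = 512
STEP_CM = 1.0 / (8.0 * N_POINTS)      # gives 8 cm^-1 spacing between spectral points
BANDS = {1648.0: 1.0, 1544.0: 0.6}    # wavenumber (cm^-1) -> relative intensity (assumed)

interferogram = [
    sum(amp * math.cos(2.0 * math.pi * nu * m * STEP_CM) for nu, amp in BANDS.items())
    for m in range(N_POINTS)
]

wavenumbers, magnitudes = spectrum_from_interferogram(interferogram, STEP_CM)
strongest = sorted(range(len(magnitudes)), key=magnitudes.__getitem__, reverse=True)[:2]
print(sorted(wavenumbers[i] for i in strongest))
```

The spacing of the spectral points is set by the total path difference scanned (here 1/8 cm, giving 8 cm⁻¹), which is why longer mirror travel gives higher spectral resolution.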

Proteins,  lipids,  and  nucleic  acids  are  major  components  of  all  living  things; 
FTIR  spectroscopy  can  detect  changes  in  the  molecular  structures  of  all  three  classes 
of  macromolecules,  and  therefore  this  technique  has  been  widely  used  in  the  life 
sciences  in  applications  ranging  from  diagnostics,  to  assessing  cellular  responses,  to 
protein conformational analysis of therapeutic proteins [88-113]. In this chapter we
will  focus  on  applications  of  FTIR  to  equilibrium  measurements  of  protein  structure. 
Using  FTIR  for  clinical  diagnostics  is  an  exciting  new  and  evolving  field  [106-108], 
but  is  beyond  the  scope  of  this  chapter. 

3.2.4.2  Applications 

Protein  Secondary  Structure  Analysis 

Over the past 25 years FTIR spectroscopy has been used extensively to analyze
protein secondary structure under many different conditions, including in solution
[91-105], in the solid state [92-94], adsorbed on surfaces [97-107], in
biodegradable polymers [108], on aluminum oxide [109], and on glass particles [110].
Early applications were limited to solid state or deuterium oxide solutions due to
interference from strong IR absorption by water that was difficult to subtract
accurately; this was later addressed by the use of sample cells with very short path
lengths (i.e., <10 μm) [91]. Correlation between the secondary structure content of proteins
determined from X-ray crystallography (PDB) and the amide I and III regions of
the FTIR spectra identified specific bands that correspond to the different types of
secondary structure (such as α-helices and β-sheets) in aqueous solution [91-93, 98,
99, 113]. The FTIR amide I and III bands have been used to identify the components
of the secondary structure of a protein, and to monitor changes in this structure
induced by changes in the solution conditions, by stress, upon binding to solid
surfaces, by lyophilization, etc. FTIR is particularly sensitive to the presence of
β-sheet secondary structures, which generally have very weak signals in the far-UV
CD region, making these two methodologies complementary. Until recently a
relatively high protein concentration, on the order of 10 mg/mL or higher, was required
for FTIR analyses. This has been addressed through the development of newer
FTIR instruments with greatly improved signal-to-noise levels, and the use of an
ATR (attenuated total reflectance) accessory [114]. With this combination of
improvements, undistorted protein structure analysis can be achieved at single-digit
mg/mL protein concentrations.

An example of the analysis of protein secondary structure in the solid state is
given by Griebenow and Klibanov, who used FTIR spectroscopy in the amide III
band region to study changes in the secondary structure of over a dozen proteins
following lyophilization [112]. They found that dehydration reversibly affected the
secondary structures, while lyophilization increased the β-sheet content and decreased
the α-helix content significantly for all proteins tested. Prestrelski et al. [93] showed
that dehydration induced conformational changes in proteins, and that these changes
could be inhibited by the inclusion of stabilizers such as sucrose, based on changes
in the amide I region of the protein FTIR spectra. The application of FTIR-ATR to
the study of the kinetics of protein adsorption and of changes in protein secondary
structure and orientation upon adsorption to surfaces has been reviewed by Chittur [113].
The utility of using FTIR for protein secondary structure characterization for many
different protein families including enzymes, cytokines/growth factors, and therapeutic
proteins  is  demonstrated  by  the  abundance  of  publications  in  this  area,  a  few  of 
which  are  included  in  this  section. 


Tertiary  Structure 

The IR bands from the aromatic side chains of the amino acids in proteins are
especially sensitive to conformational changes. FTIR and FTIR difference spectroscopy
have been widely used for analysis of protein conformation and to obtain detailed
local information around active sites, by monitoring the amide bands [88-102] and
the specific functional groups involved [115, 116]. The IR properties of most
amino acid side chains have been used to study the mechanism of protein reactions.
A comprehensive review by Barth on the infrared absorption of amino acid side
chains, a less common use of FTIR, describes how the IR properties of side chains
have been determined and used to study changes in protein structures and
enzymatic reactions [117].

FTIR  Difference  Spectroscopy 

FTIR difference spectroscopy is a modification of the IR methodology that has
proven successful in following the molecular details of protein function and
mechanisms of action, due to its selectivity and sensitivity. Applications of FTIR
difference spectroscopy have been reviewed extensively by Mäntele, Zscherp, and Barth
[118-120]. This technology was originally developed for the investigation of
light-induced reactions of photo-reactive proteins. FTIR difference spectroscopy can also
be used for the study of redox proteins by the use of electrochemical cells, for the
study of many different enzymes by the use of photolabile effector molecules, or as
a more general method for the study of buffer and ligand effects by the use of
proteins immobilized on ATR crystals.

H/D  Exchange  for  Dynamic  Structure  Analysis 

Hydrogen to deuterium (H/D) exchange occurs quite rapidly for solvent-exposed
protons in N-H and O-H bonds when proteins are placed in deuterium oxide (²H₂O
or D₂O, "heavy water"). H/D exchange can be monitored by mass spectrometry
(MS), NMR, and FTIR to study protein conformation and dynamics [121, 122]. H/D
exchange in conjunction with MS or NMR can provide information on the regions
of the polypeptide chain that are in the most flexible parts of the molecule,
as discussed in other chapters in this volume [123]. The exchange of hydrogen (¹H) for
deuterium (²H) in the protein peptide bonds affects the frequency and intensity of
the amide I and II bands of the FTIR spectrum of the molecule. H/D exchange therefore
allows the monitoring of the flexibility and dynamics of the protein structure.
FTIR in conjunction with hydrogen-deuterium exchange has been used to study
protein dynamics and identify the secondary structures that are the most flexible
within a protein. Changes in the flexibility of a protein both in solution and in the
solid state under different conditions, and as a result of mutations, can also be probed
[99, 124-132]. This is achieved by measuring the integrated intensities of the amide I
and amide II bands and following the changes in their ratio as a function of time. This
complements MS or NMR analyses and can provide conformation-specific
information that MS cannot.
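
The ratio-based analysis described above can be sketched as follows. The calculation assumes that the amide II intensity falls as N-H groups are converted to N-D, while the amide I band serves as an internal reference for the amount of protein in the beam. The band areas are hypothetical, and the normalization shown is one simple choice rather than the specific procedure of any of the cited studies.

```python
def fraction_unexchanged(amide_ii_area, amide_i_area, ratio_at_start,
                         ratio_fully_exchanged=0.0):
    """Estimate the fraction of amide protons not yet exchanged.

    Uses the amide II / amide I area ratio, scaled between its starting value
    and the value expected for a fully exchanged sample (assumed 0 by default).
    """
    ratio = amide_ii_area / amide_i_area
    return (ratio - ratio_fully_exchanged) / (ratio_at_start - ratio_fully_exchanged)


# Hypothetical integrated band areas (arbitrary units) at successive times in D2O.
time_course = [
    # (minutes, amide II area, amide I area)
    (0, 50.0, 100.0),
    (5, 38.0, 100.0),
    (30, 29.0, 99.0),
    (120, 22.0, 98.0),
]

start_ratio = time_course[0][1] / time_course[0][2]
fractions = [
    (minutes, fraction_unexchanged(area_ii, area_i, start_ratio))
    for minutes, area_ii, area_i in time_course
]

for minutes, fraction in fractions:
    print(f"t = {minutes:>3} min: fraction unexchanged ~ {fraction:.2f}")
```

A protein whose fraction unexchanged falls more quickly under the same conditions has more solvent-accessible or more flexible backbone amides.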

An example of the type of information that can be gained using FTIR and H/D
exchange can be found in the work by Narhi and colleagues, in their study on the
effect of mutations on the propensity of α-synuclein to aggregate [124]. They
showed that the familial Parkinson's disease mutations resulted in more flexible
proteins and accelerated α-synuclein aggregation, based on the increase in their H/D
exchange rates. FTIR and H/D exchange were used by Baenziger and Methot [125]
to study the nicotinic acetylcholine receptor. They found that the α-helical secondary
structure in the protein is buried, resistant to exchange, and likely composed of the
transmembrane domains. Using FTIR and H/D exchange, Rath et al. showed that
photoactivation of rhodopsin causes an increase in hydrogen-deuterium exchange rates
of buried peptide groups, as evidenced by the extent of hydrogen-deuterium
exchange of the backbone peptide groups [126]. French and coauthors investigated
the influence of trehalose and humidity on the conformation and hydration of
spray-dried recombinant human G-CSF and recombinant consensus interferon-alpha
(rConIFN) using FTIR and H/D exchange [127]. They showed that trehalose had
a protective effect on the secondary structure of the proteins and stabilized
them at 33 % RH (relative humidity).

Yu et al. also applied FTIR and H/D exchange to study the mechanism of cAMP
activation [128]. The authors demonstrated that, in contrast to the cAMP-dependent
protein kinase, the classic intracellular cAMP receptor, binding of cAMP to Epac does
not induce significant changes in overall secondary structure and structural dynamics.
Additionally, Kamerzell and Middaugh [129] employed
two-dimensional correlation FTIR spectroscopy and H/D exchange to monitor the
time-dependent structural changes of an IgG1 as a function of pH and revealed coupled
immunoglobulin regions of differential flexibility that influence the stability of the
antibody. This is only a small sampling of the great potential for applying FTIR and
H/D exchange in the life sciences.

Protein  Folding  and  Stability 

The sensitivity of FTIR spectroscopy in analyzing protein structure and dynamics
has led to numerous applications in studying protein folding and stability. Fabian
and Naumann [130] reviewed the use of stopped-flow FTIR for initiating and
monitoring protein folding processes on the millisecond to minute timescale. The
unfolding and refolding of ribonuclease A under various conditions such as temperature,
pressure, pH, and denaturant has been extensively studied by a number of scientists
using FTIR [100, 131, 132]. FTIR offers the advantage of monitoring the kinetics of
different secondary structure components simultaneously, and assessing the order
of the events, helping to identify which regions of the protein unfold or fold first.
Sethuraman and Belfort [133] used FTIR-ATR in the amide I region to demonstrate
that globular proteins, such as hen egg lysozyme in phosphate-buffered saline at
room temperature, lose native structural stability and activity when adsorbed onto
well-defined homogeneous solid surfaces. The structural loss and change is shown
by a transition of α-helix to turns or random coil, followed by a slow α-helix to
β-sheet transition, as evidenced by the changes in the characteristic band
frequencies and intensities of these different secondary structure components. Recently,
Sharma et al. used FTIR to study the thermal and structural stability of adsorbed
proteins [134]. They showed that proteins adsorbed to hydrophobic surfaces at low
coverage are stabilized relative to the bulk, while at higher coverage proteins unfold
and form β-sheets. These results demonstrate the utility of the FTIR spectroscopic
technique for the characterization of protein folding and stability.

3.2.4.3    Data  Analysis 

Data analysis of FTIR spectra has evolved over the years from a qualitative
assessment of the changes in band frequencies and intensities to more complicated
mathematical algorithms, especially for the study of changes in protein conformation.
Because the protein FTIR spectrum consists of overlapping bands, the features are
quite broad. Fourier self-deconvolution and derivative methods [89, 98-100, 133]
have been used to resolve the contributions of the various types of secondary
structure. This allows the estimation and quantification of the changes in the individual
components. Fourier self-deconvolution and second derivative algorithms are the
most commonly used methods for identifying the presence of different types of
protein secondary structure and monitoring changes in the protein secondary
structure. Many of the examples described above made use of these methods.
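
The resolving power of the second derivative can be seen with a synthetic example. In the sketch below, two overlapping bands merge into an envelope with a single maximum, yet each band centre appears as a separate negative minimum in the second derivative. The band positions, heights, and widths are assumptions chosen for illustration, and a simple central-difference derivative is used in place of the smoothing derivative filters usually applied to measured spectra.

```python
import math


def gaussian_band(x, center, height, sigma):
    """Gaussian band profile evaluated at wavenumber x."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def second_derivative(y, step):
    """Central-difference second derivative; the two end points are dropped."""
    return [(y[i + 1] - 2.0 * y[i] + y[i - 1]) / step ** 2 for i in range(1, len(y) - 1)]


def local_extrema(x, y, kind):
    """Positions of interior local maxima (kind='max') or minima (kind='min')."""
    sign = 1.0 if kind == "max" else -1.0
    return [
        x[i] for i in range(1, len(y) - 1)
        if sign * y[i] > sign * y[i - 1] and sign * y[i] > sign * y[i + 1]
    ]


# Synthetic amide I envelope: two overlapping bands at assumed positions.
STEP = 1.0
wavenumbers = [1580.0 + STEP * i for i in range(141)]   # 1580 to 1720 cm^-1
spectrum = [
    gaussian_band(x, 1654.0, 1.0, 10.0) + gaussian_band(x, 1636.0, 0.8, 10.0)
    for x in wavenumbers
]

d2 = second_derivative(spectrum, STEP)
d2_x = wavenumbers[1:-1]
d2_at = dict(zip(d2_x, d2))

raw_maxima = local_extrema(wavenumbers, spectrum, "max")
# Band centres show up as negative minima in the second derivative.
d2_minima = [x for x in local_extrema(d2_x, d2, "min") if d2_at[x] < 0.0]

print("maxima in raw spectrum:", raw_maxima)
print("minima in second derivative:", d2_minima)
```

Because differentiation amplifies noise, measured spectra generally need a high signal-to-noise ratio and careful subtraction of water vapor and buffer contributions before this kind of analysis is meaningful.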

Two-dimensional correlation FTIR spectroscopy is another method that was
developed to improve the resolution for detecting spectral and structural changes
[135-137]. In the last few years this method has been applied with increasing
frequency to studies of protein structure and dynamics as a function of external
conditions such as pH and temperature. Kamerzell and Middaugh [129] used
two-dimensional correlation FTIR spectroscopy to monitor the time evolution of
hydrogen-deuterium exchange of an IgG1 as a function of pH. They observed
differential flexibility of various immunoglobulin regions in response to an external
perturbation. Sosa and coauthors [138] used the method to study the structure of the
protein centrin and its interaction with melittin. The authors showed that
two-dimensional correlation FTIR spectroscopy enabled the determination of the
increased helicity of melittin, demonstrated that centrin was stabilized within
the complex, and provided a complete molecular description of the formation of the
centrin-melittin complex and the dissociation process.
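
A minimal sketch of the synchronous part of a generalized two-dimensional correlation analysis is given below. Each spectrum in a perturbation series is first converted to a dynamic spectrum by subtracting the mean over the series, and the synchronous intensity for a pair of wavenumbers is the average product of their dynamic intensities. The absorbance values are hypothetical, and the asynchronous spectrum, which carries the information on the sequence of events, is not computed here.

```python
def synchronous_correlation(spectra):
    """Synchronous 2D correlation matrix.

    spectra[j][i] is the intensity at channel i for perturbation step j.
    """
    n_steps = len(spectra)
    n_channels = len(spectra[0])
    means = [sum(row[i] for row in spectra) / n_steps for i in range(n_channels)]
    # Dynamic spectra: deviation of each spectrum from the mean over the series.
    dynamic = [[row[i] - means[i] for i in range(n_channels)] for row in spectra]
    return [
        [
            sum(dynamic[j][a] * dynamic[j][b] for j in range(n_steps)) / (n_steps - 1)
            for b in range(n_channels)
        ]
        for a in range(n_channels)
    ]


# Hypothetical absorbances at three wavenumber channels for four perturbation steps.
channels = [1620, 1654, 1680]
spectra = [
    [0.10, 0.80, 0.05],
    [0.20, 0.65, 0.10],
    [0.35, 0.50, 0.16],
    [0.50, 0.30, 0.22],
]

sync = synchronous_correlation(spectra)
for a, row in zip(channels, sync):
    print(a, [f"{value:+.4f}" for value in row])
```

A positive cross peak indicates two bands whose intensities change in the same direction under the perturbation, and a negative cross peak indicates bands that change in opposite directions.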

Other mathematical algorithms such as the correlation coefficient, the spectral
area of overlap, and the derivative correlation function [94, 139-141] have been
developed and/or applied to compare and quantify the overall changes in FTIR
spectra as a result of external conditions. D'Antonio and colleagues [46] compared
FTIR spectra using several mathematical algorithms, including the spectral
correlation coefficient, area of overlap, derivative correlation, and a modified area of
overlap method. The authors showed that all four algorithms were able to detect
significant differences between samples and that the results from all four algorithms
were consistent with visual assessments by expert spectroscopists.
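
Two of these similarity measures can be sketched in a few lines. The formulations below are common ones and are offered as assumptions: the exact definitions and preprocessing (for example the use of second-derivative, baseline-corrected spectra) in the cited papers may differ. The synthetic bands are for illustration only.

```python
import math


def correlation_coefficient(a, b):
    """Spectral correlation coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2))."""
    numerator = sum(x * y for x, y in zip(a, b))
    return numerator / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


def area_of_overlap(a, b):
    """Shared area of two non-negative spectra after normalising each to unit area."""
    total_a, total_b = sum(a), sum(b)
    return sum(min(x / total_a, y / total_b) for x, y in zip(a, b))


def band(center, sigma=8.0):
    """Synthetic single band sampled every 1 cm^-1 from 1600 to 1700 cm^-1."""
    return [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) for x in range(1600, 1701)]


reference = band(1654.0)
shifted = band(1648.0)   # same band displaced by 6 cm^-1

print(f"identical: r = {correlation_coefficient(reference, reference):.3f}, "
      f"overlap = {area_of_overlap(reference, reference):.3f}")
print(f"shifted:   r = {correlation_coefficient(reference, shifted):.3f}, "
      f"overlap = {area_of_overlap(reference, shifted):.3f}")
```

Both measures equal 1 for identical spectra and decrease as the spectra diverge, which is what allows a single number to summarize an overall spectral change.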

3.2.4.4    Concluding  Comments  on  FTIR 

FTIR  spectroscopy  is  a  versatile  technique  that  has  been  used  widely  in  the  life  sci- 
ences for  the  study  of  protein  structure,  dynamics,  and  stability.  The  main  advantage 
of  FTIR  over  other  spectroscopic  techniques  such  as  UV  CD  and  fluorescence  is 
that  FTIR  can  be  used  to  study  samples  in  any  form,  including  liquids,  solids,  and 
even  gases.  The  primary  weakness  of  the  technique  is  the  need  for  relatively  high 
protein  concentration  for  the  study  of  protein  conformation  as  compared  to  other 
types  of  analyses.  With  the  continued  evolution  of  both  more  sensitive  instruments 
and  improved  mathematical  approaches,  this  technique  will  continue  to  see  growing 
applications  in  the  life  sciences. 


3.2.5  Raman 

3.2.5.1  Theory 

Raman spectroscopy measures the inelastic scattering (Raman scattering) of monochromatic
light. Interactions of the laser light with molecular vibrations, phonons, or other excitations
in the system shift the energy of the laser photons up or down, and the shift in energy gives
information about the vibrational modes in the system. Raman scattering is sensitive to
changes in the polarizability of the molecule or chemical bond. The polarizability represents
the ability of an applied electric field to induce a dipole moment in a molecule or chemical
bond without changing the internuclear separation. The intensity and frequency of the Raman
scattering are determined by the magnitude of the change in the induced dipole moment and
by the energy state of the chemical bond, respectively. This technique can therefore be used
to study the vibrational, rotational, and other low-frequency modes of a molecular system. The
Raman spectrum of a protein or DNA consists of numerous discrete bands representing the
normal molecular modes of vibration and serves as a sensitive and selective fingerprint of
three-dimensional structure (including secondary and tertiary structures), intermolecular
interactions, and dynamics. For a protein molecule, the stretching vibrations of the peptide
C=O and C-N bonds yield high Raman intensities because of the large polarizability changes
associated with these vibrations. In-plane vibrations of the rings of the aromatic side chains
of Trp, Tyr, and Phe also produce high-intensity Raman bands. Stretching vibrations of C-C,
C-N, and C-O bonds also produce intense Raman bands. Vibrations that involve the
displacement of heavy atoms such as sulfur, including the C-S stretching modes of methionine
(Met) and cysteine (Cys), the S-S stretching of cystine, the S-H stretching of Cys, and the
Zn-S stretching in zinc metalloproteins, are relatively intense as well [142-159].

Table 3.3  Raman amide I and amide III bands for different secondary structures of proteins
or polypeptides [142-149]

Molecule                   Amide I (cm⁻¹)   Amide III (cm⁻¹)   Secondary structure
α-Poly-L-alanine           1,655            1,264-1,348        α-Helix
α-Poly-L-glutamate         1,652            1,290              α-Helix
α-Poly-L-lysine            1,645            1,295-1,311        α-Helix
β-Poly-L-alanine           1,669            1,226-1,243        β-Strand
β-Poly-L-glutamate         1,672            1,236              β-Strand
β-Poly-L-lysine            1,670            1,240              β-Strand
Poly-L-lysine, pH 4        1,665            1,243-1,248        Irregular
Poly-L-glutamate, pH 11    1,656            1,249              Irregular
Fc                         1,670            1,240              β-Strand
α-Helical protein          1,652            1,276-1,327        α-Helix
Protein-2                  1,671            1,240
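
As a small worked illustration of the energy shift described in the theory above, the sketch below converts between wavelengths and a Raman shift expressed in wavenumbers (cm⁻¹). The function names and the 532 nm excitation line are assumptions chosen for the example; they do not come from the text.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 (positive when the scattered photon has lost energy)."""
    # 1e7 / wavelength_in_nm gives the absolute wavenumber in cm^-1.
    return 1.0e7 / excitation_nm - 1.0e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength at which a band with the given Raman shift is detected."""
    return 1.0e7 / (1.0e7 / excitation_nm - shift_cm1)


# Hypothetical example: where does an amide I band near 1,655 cm^-1 appear
# when the sample is excited at 532 nm (assumed laser line)?
amide_i_nm = scattered_wavelength_nm(532.0, 1655.0)
print(f"amide I detected near {amide_i_nm:.1f} nm")
```

The shift is independent of the excitation wavelength, which is why Raman spectra are reported in cm⁻¹ relative to the laser line rather than as absolute wavelengths.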

The correlations between Raman band intensities and frequencies in a protein spectrum and
the local environment and structure of the peptide bond or specific side chains are detailed
below.


Peptide  Bond 

The band shapes and peak positions of the amide I region (1,640-1,680 cm⁻¹), which is
primarily a carbonyl stretching mode, and the amide III signal (1,230-1,310 cm⁻¹), which
combines both in-plane C-N-H bending and C-N stretching motions, are very sensitive to
changes in the protein secondary structure and are often used as indicators of protein
secondary structure integrity. Table 3.3 lists the Raman amide I and amide III bands
assigned to the different secondary structures of representative polypeptides and proteins
[142-149].

Tyr 

The phenyl ring of Tyr can generate a pair of relatively intense Raman bands at 850 and
830 cm⁻¹. The intensity ratio of the Tyr doublet (I850/I830) is related to the hydrogen
bonding state of the phenolic OH group of this amino acid. When the phenolic OH group is a
strong hydrogen-bond donor, I850/I830 = 0.3; when the phenolic OH group is a strong
hydrogen-bond acceptor, I850/I830 = 2.5; and when the phenolic OH group functions as both
a donor and an acceptor, I850/I830 = 1.25 [150-152].
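
The three reference ratios quoted above can be turned into a simple lookup. The sketch below assigns a measured Tyr doublet ratio to the nearest reference state. The nearest-value rule and all names are assumptions made for illustration; in a protein with several Tyr residues the measured ratio is an average over all of them, so the result is only a rough guide.

```python
# Reference I850/I830 ratios quoted in the text [150-152].
TYR_REFERENCE_RATIOS = {
    "strong hydrogen-bond donor": 0.3,
    "donor and acceptor": 1.25,
    "strong hydrogen-bond acceptor": 2.5,
}


def closest_tyr_state(i850: float, i830: float):
    """Return (ratio, nearest reference state) for a measured Tyr doublet."""
    if i830 <= 0:
        raise ValueError("I830 must be positive")
    ratio = i850 / i830
    # Nearest-reference assignment: a deliberate simplification.
    state = min(TYR_REFERENCE_RATIOS,
                key=lambda name: abs(TYR_REFERENCE_RATIOS[name] - ratio))
    return ratio, state


# Hypothetical band intensities in arbitrary units.
print(closest_tyr_state(1.2, 1.0))
```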
Trp

Trp generates many Raman bands [153-156], which have been correlated to the local
environment and geometry of the Trp side chain in proteins. An intense and sharp Raman band
between 1,540 and 1,560 cm⁻¹ is an indicator of the absolute value of the side chain torsion
angle χ2,1 (Cδ1-Cγ-Cβ-Cα). This aromatic amino acid also generates a Fermi doublet with
components at 1,360 and 1,340 cm⁻¹. The intensity ratio of the Fermi doublet (I1,360/I1,340)
increases with increasing hydrophobicity of the indolyl ring environment and thus serves as
an indicator of local hydropathy. A Raman band near 880 cm⁻¹ is sensitive to indolyl N-H
hydrogen-bond donation and shifts to lower frequency with increasing strength of N-H···X
hydrogen bonding. An indole ring-breathing vibration of Trp generates an intense band (near
755 cm⁻¹) in the Raman spectra of proteins. The intensity of this band increases with
decreasing hydrophobicity of the indolyl ring environment.

Cys 

The  Raman  band  resulting  from  the  Cys  sulfhydryl  bond  (S-H)  stretching  vibration 
occurs in the Raman region of 2,500-2,600 cm⁻¹. The S-H stretching vibrational
band  is  sensitive  to  the  hydrogen-bond  interactions  of  the  S-H  group;  the  frequencies 
of  S-H  at  different  H-bonding  states  are  listed  in  Table  3.4  [157-159].  The  intensity 
of  this  band  is  a  direct  measure  of  the  concentration  of  cysteinyl  S-H  in  a  protein. 

His 

In D2O solution, histidine has a moderately intense Raman band near 1,408 cm⁻¹, which is
assigned to an in-plane vibration of the N-deuterated imidazolium ring [160, 161].


Table 3.4  Raman S-H frequency and hydrogen bonding states [157-159]

Hydrogen bonding state     S-H band            S-H band
of S-H group               frequency (cm⁻¹)    width (cm⁻¹)   Examples
No hydrogen bond           2,581-2,589         12-17          Thiols in CCl4 (dilute solution)
S acceptor                 2,590-2,595         12-17          Thiols in CHCl3
Weak S-H donor             2,575-2,580         20-25          Thiol neat liquids; thiols in thioesters
Moderate S-H donor         2,560-2,575         25-30          Thiols in acetone; crystal structures
Strong S-H donor           2,525-2,560         35-60          Thiols in dimethylacetamide; crystal structures
S-H donor and S acceptor   2,565-2,575         30-40          Thiols in H2O
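
The frequency ranges in Table 3.4 overlap, so a measured S-H band position does not always map to a single hydrogen bonding state. The sketch below encodes the tabulated ranges and returns every state consistent with a given frequency. The function and variable names are illustrative assumptions, and the band width column (not used here) would be needed to discriminate further.

```python
# (state, low cm^-1, high cm^-1) copied from Table 3.4.
SH_STATES = [
    ("No hydrogen bond", 2581, 2589),
    ("S acceptor", 2590, 2595),
    ("Weak S-H donor", 2575, 2580),
    ("Moderate S-H donor", 2560, 2575),
    ("Strong S-H donor", 2525, 2560),
    ("S-H donor and S acceptor", 2565, 2575),
]


def candidate_sh_states(frequency_cm1: float):
    """List every Table 3.4 state whose frequency range contains the value."""
    return [name for name, low, high in SH_STATES if low <= frequency_cm1 <= high]


print(candidate_sh_states(2585))  # one candidate
print(candidate_sh_states(2570))  # two overlapping candidates
```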
DNA

For a DNA molecule, different nucleotide conformations and secondary structures exhibit
different characteristic Raman bands. Characteristic carbonyl modes in the 1,600-1,700 cm⁻¹
region reflect differences in base pair hydrogen bonding of the respective GC complexes. The
intense Raman lines of the phosphodiester backbone in the 750-850 cm⁻¹ region are the most
useful for qualitative identification of B-, Z-, and A-forms of DNA structures.

3.2.5.2  Applications

Protein  Conformation

The Raman spectrum of a protein usually contains signals from a mixture of α-helix,
β-sheet, and random structures, as well as contributions from the aromatic amino acids and
Cys as described above. Raman band frequencies and shapes can be used to analyze protein
secondary structures, determine side chain conformations, and detect intra- and
intermolecular interactions, including the formation of specific complexes and large
assemblies, and to quantify the amount of free S-H. Thus one can obtain information about
the secondary structure content of a protein as well as information about the tertiary
structure from the same spectrum, obtained under the same conditions. Figure 3.15 shows an
FT-Raman spectrum of an E. coli-derived Fc domain at pH 7, with most of the characteristic
Raman bands labeled.

Raman spectroscopy is very useful for protein secondary structure analysis, complementing
the information one gets from FTIR or far UV CD analysis. The Raman spectra of two proteins
can have the same peak position but different shapes for the amide I band, due to
differences in the percentages of α-helix, β-sheet, and random structure in the secondary
structures of the proteins. These percentages can be determined by deconvolution of the
spectra in this region.

Tertiary structure can be probed with Raman spectroscopy by monitoring the conformations
of several amino acid side chains and the environments in which they are located. A number
of correlations have been established between Raman band intensities and/or frequencies of
Cys, Tyr, Trp, and His residues and the local environments or structures of these specific
side chains, as detailed in the theory subsection above, and changes in protein conformation
can be monitored by changes in these signals. Raman spectra can be useful for monitoring
His H/D exchange dynamics as well as histidine-histidinium equilibria (in D2O). A linear
correlation between the frequency of the 1,408 cm⁻¹ Raman band and the His side chain
conformation has been proposed.

One important aspect of Raman spectroscopy of proteins is the ability to detect free
(reduced) Cys. Because the Raman band is specific to the S-H group, its intensity is a
direct measure of the concentration of cysteinyl S-H sites. It can be used to determine the
H-bonding states and amount of free S-H in a protein, and also to measure the pKa of
thiolate titration. This can be very useful in following the oxidation and formation of
disulfide bonds.

Fig. 3.15  Raman spectrum of mAb in the 400-3,500 cm⁻¹ region

Raman difference spectroscopy, in which a spectrum is computed digitally by subtracting
one Raman spectrum from another, can be very useful for visualizing differences in Raman
band shape and/or frequency between samples and for increasing the sensitivity with which
conformational changes in a protein are detected.
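
A minimal sketch of a digitally computed difference spectrum is given below. It assumes that both spectra were recorded on the same Raman shift axis and that each is first scaled to an internal reference band. The choice of a window around 1,003 cm⁻¹ (a Phe ring band often used for normalization) and all names are assumptions for illustration, not a procedure taken from the text.

```python
def normalize_to_band(shifts, intensities, low, high):
    """Scale a spectrum so its maximum inside [low, high] cm^-1 equals 1."""
    in_band = [i for s, i in zip(shifts, intensities) if low <= s <= high]
    if not in_band:
        raise ValueError("no data points inside the normalization band")
    reference = max(in_band)
    return [i / reference for i in intensities]


def difference_spectrum(shifts, sample, reference, norm_band=(1000.0, 1010.0)):
    """Sample minus reference after normalizing both to the same band."""
    a = normalize_to_band(shifts, sample, *norm_band)
    b = normalize_to_band(shifts, reference, *norm_band)
    return [x - y for x, y in zip(a, b)]


# Hypothetical three-point spectra (arbitrary intensity units).
shifts = [1003.0, 1650.0, 1670.0]
native = [2.0, 4.0, 2.0]
stressed = [1.0, 1.5, 1.5]
diff = difference_spectrum(shifts, stressed, native)
print(diff)
```

In this toy example the negative value near 1,650 cm⁻¹ and the positive value near 1,670 cm⁻¹ are the kind of feature that would draw attention to a change in the amide I band shape.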

Protein  Dynamics 

Ultraviolet resonance Raman (UVRR) spectroscopy uses selective excitation in the UV
absorption bands of macromolecules to produce spectra of particular chromophoric segments.
Resonance excitation has the advantage of selectivity in the transitions being targeted for
study, and has become a powerful tool for protein folding studies, via enhancement of the
amide vibrations of the polypeptide backbone by the use of deep UV excitation [162, 163].
Dual wavelength UVRR allows both peptide and aromatic spectra to be obtained simultaneously
[164]. Due to its sensitivity and structural selectivity, UVRR spectroscopy can provide
unique insight into protein dynamics. Selective excitation (~230 nm) of Trp and Tyr residues
can monitor the details of the dynamic flexibility of the tertiary structure of a protein
and the structural changes involved. When changes in secondary structure occur, as in
protein folding, these can be elucidated via excitation (~200 nm) of the amide backbone.
H-D exchange Raman spectroscopy can also be used to study protein flexibility and folding
[165-168].

DNA  Conformation 

Raman spectroscopy is very useful for DNA structure analysis. A conformationally sensitive
guanine mode, which yields Raman bands near 682, 668, or 625 cm⁻¹ in B (C2'-endo, anti),
A (C3'-endo, anti), or Z (C3'-endo, syn) structures, respectively, is the most useful for
quantitative analysis. In D2O, the guanine band of Z-DNA is shifted to 615 cm⁻¹, permitting
its detection even in the presence of proteins. These conformationally sensitive Raman bands
can be employed to identify the nucleic acid secondary structure within a capsid or
nucleoprotein [169, 170].

Deconvolution  of  Raman  Spectra 

Like most spectroscopic techniques, Raman spectroscopy captures the distribution of
conformations and thus reflects the average conformation. The quantitative analysis method
most commonly used in the field is deconvolution of the Raman spectrum of a protein or DNA
into the contributions of individual structural elements (α-helix, β-sheet, β-turn, and
unordered conformation for a protein, or B-, A-, and Z-forms for DNA), followed by
determination of the percentage of each component in the protein secondary structure or DNA
form [171]. This approach is used primarily for secondary structure determination.
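
The final step of such a deconvolution, converting fitted component bands into percentages, can be sketched as follows. The example assumes that the band fitting itself has already been done, that each component is a Gaussian described by a peak height and a full width at half maximum, and that all components scatter equally strongly per residue. These are simplifying assumptions, and the component values are invented for illustration.

```python
import math


def gaussian_area(height: float, fwhm: float) -> float:
    """Integrated area of a Gaussian band from its height and FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return height * sigma * math.sqrt(2.0 * math.pi)


def structure_percentages(components):
    """components: iterable of (structure label, height, fwhm) for fitted bands."""
    areas = {}
    for label, height, fwhm in components:
        # Several bands may be assigned to the same structural element.
        areas[label] = areas.get(label, 0.0) + gaussian_area(height, fwhm)
    total = sum(areas.values())
    return {label: 100.0 * area / total for label, area in areas.items()}


# Hypothetical fitted amide I components (height in arbitrary units, FWHM in cm^-1).
fitted = [("alpha-helix", 1.0, 20.0), ("beta-sheet", 0.5, 20.0), ("unordered", 0.5, 20.0)]
percentages = structure_percentages(fitted)
print(percentages)
```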

Raman  Microscopy 

Raman microscopy is a powerful technique for solid-state analysis of protein
pharmaceuticals; it can be applied in situ within glass containers or to isolated particle
samples. The laser beam of a confocal Raman microscope can be focused sharply on small
particles (on the order of 20 μm or larger) to obtain high-quality Raman spectra. The
penetration of visible laser light through glass enables in situ analysis to be performed
without any sample manipulation [172, 173]. The characteristic fingerprint of Raman protein
spectra allows differentiation between protein product and placebo and between different
protein products. Raman microscopy has been used to differentiate drug counterfeits from
authentic products and has been employed in investigations of regulatory nonconformance to
quickly confirm the presence and identity of foreign particles in primary glass containers
by in situ analysis. Some applications of Raman spectroscopic analysis for protein
pharmaceuticals, both solutions and solids, in primary glass containers are described by Wen
and colleagues [172]. These include protein gelation, protein product identification, in
situ analysis of residues in primary containers, and the investigation of manufacturing
incidents during drug product filling operations [172].

3.2.5.3  Emerging  Raman  Technologies 

Applications of Raman spectroscopy to the life sciences are rapidly increasing, in part as
a result of the rapid evolution of the instrumentation itself. One relatively new innovation
is the nondestructive, lightweight, handheld point-and-shoot Raman system that is being
widely used for rapid raw material and drug product verification. The system enables
analysis through sealed packaging to minimize the risk of exposure and contamination of the
material being tested, and its embedded analysis software delivers a PASS/FAIL decision
verifying the identity of a sample.

There are a number of advanced types of Raman spectroscopy under development, including
surface-enhanced Raman spectroscopy (SERS), in which the Raman signals are enhanced when
molecules bind to the surface of a metal in a variety of morphologies. Tip-enhanced Raman
is an alternative to conventional SERS in which a modified AFM tip is brought into contact
with a sample surface. Polarized Raman spectroscopy can probe molecular orientation and the
symmetry of the bond vibrations, in addition to the general chemical identification which
conventional Raman provides. Stimulated Raman is a technique being explored in which both
the signal Raman beam inducing the inelastic scattering and a pump Raman beam are applied
to the sample in an optical medium at the same time.

Raman optical activity (ROA) is a vibrational spectroscopic technique complementary to
VCD, similar to the way Raman spectroscopy is complementary to FTIR, and is especially
useful in the spectral region between 50 and 1,600 cm⁻¹. It is considered the technique of
choice for determining optical activity at wavenumbers below 600 cm⁻¹ [174].

3.2.5.4  Concluding  Comments  on  Raman  Spectroscopy 

The Raman spectrum of a protein contains information on both the secondary and tertiary
structure of the molecule. This includes the secondary structure of the protein backbone and
the amounts of α-helix, β-sheet, and irregular structures present, as well as the state of
hydrogen bonding, the conformation and configuration of local environments of side chains,
and intermolecular interactions. It can also be used to distinguish the different helical
forms of DNA. This technique is also very useful for studying protein and nucleic acid
assemblies. An important advantage of Raman over IR spectroscopy for applications to
proteins and their complexes is the virtual transparency of liquid water in this type of
spectroscopy, which greatly simplifies the analysis of spectra of aqueous solutions of
biomolecules. Another advantage is that Raman spectroscopy can be used for samples in
different physical states (solutions, suspensions, gels, precipitates, fibers, single
crystals, amorphous solids, etc.). In general, however, the Raman instrument is more complex
than the FTIR instrument, and it is more problematic to compare the scattering intensities
of Raman bands quantitatively, whereas IR absorbance intensities are governed by Beer's law.
The Raman signal is also typically weaker than the corresponding IR signal of the same
sample, meaning that higher concentrations of the macromolecule being analyzed are required.


3.2.6  Light  Scattering

3.2.6.1  Theory 

In  addition  to  absorption  and  emission,  light  scattering  is  another,  more  complex, 
interaction  between  light  and  macromolecules,  as  described  in  the  theory  section  of 
this  chapter.  There  are  two  common  types  of  light  scattering  analyses  that  are  used 
in  the  life  sciences:  static  light  scattering  and  dynamic  light  scattering  (DLS). 

In static light scattering (SLS), high-intensity monochromatic light of appropriate
wavelength, most often from a laser, is passed through a solution of interest and the
resulting time-averaged scattered light intensity is collected at multiple angles. This
technique is commonly known as multiple angle light scattering (MALS) or, more recently, as
multiple angle laser light scattering (MALLS) when a laser is used as the light source. In
order to detect the scattered light at multiple angles, the wavelength of the light should
be at least 50 times greater than the size of the macromolecular species (usually protein)
doing the scattering. To this end, laser light of 690 nm is typically used for MALS/MALLS
experiments. When the particles are closer in size to the incident wavelength, the
interactions between light and the particles become more complex, with forward scattering
being the most significant component of the scattered light, an event also referred to as
Mie scattering. Because of this phenomenon, a few large particles can skew the results and
make the apparent size or hydrodynamic radius of the species in solution appear larger than
it is. SLS can be used to determine the molecular weight of the protein being analyzed as a
function of solution condition, as long as the concentration is low enough to prevent
intermolecular interactions that could affect the scattering behavior.

DLS, also referred to as quasi-elastic light scattering (QELS), measures the fluctuations
in the intensity of scattered light from a collection of molecules in the sample as a
function of time. These fluctuations in intensity are caused by the constant motion of the
molecules in solution due to Brownian motion and intermolecular interactions. They are used
to calculate the average size of the scattering molecules, including their hydration shells;
in other words, the hydrodynamic radius. Most modern instruments use autocorrelation
analysis of the light fluctuations, in which the intensity at any time t is compared to the
intensity at time t + τ, where τ is the time interval. The autocorrelation function is then
defined as

$$g^{(2)}(\tau) = 1 + c\,e^{-2h^{2}D\tau} \tag{3.14}$$

where τ is the time interval of the measurement, D is the translational diffusion
coefficient, c is a constant amplitude, and h = 4πnλ⁻¹ sin(θ/2), where θ is the scattering
angle, n is the refractive index, and λ is the wavelength of the incident light. The
autocorrelation data are analyzed using various mathematical methods, including the cumulant
method, the CONTIN algorithm, or the maximum entropy method. The cumulant analysis method is
currently defined in the ISO standards and is included with the instruments of major
manufacturers. The primary result of a DLS experiment is the averaged diffusion coefficient
D. From the diffusion coefficient the size of a molecule, defined by the hydrodynamic radius
rh, can be calculated, and with some further assumptions the molar mass can be estimated.
For the diffusion of spherical particles the Stokes-Einstein equation can be used to
calculate the hydrodynamic radius from the diffusion coefficient

$$D = \frac{kT}{f} = \frac{kT}{6\pi\eta r_h} \tag{3.15}$$

where k is the Boltzmann constant, T is the absolute temperature, η is the solution
viscosity, and the frictional coefficient f is defined by the Stokes equation as
f = 6πηrh. Application of these principles to macromolecular analysis is detailed below.
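
The chain from autocorrelation data to hydrodynamic radius can be sketched numerically. The example below generates a noise-free single-exponential correlation function of the form of Eq. 3.14, recovers D from the slope of ln(g(2) - 1) against τ, and then applies Eq. 3.15. This is a simplified monodisperse fit, not the cumulant or CONTIN analysis used by commercial software. The wavelength, scattering angle, refractive index, viscosity, temperature, and diffusion coefficient are all assumed values chosen for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def scattering_vector(n, wavelength_m, theta_deg):
    """h = 4*pi*n/lambda * sin(theta/2), in 1/m."""
    return 4.0 * math.pi * n / wavelength_m * math.sin(math.radians(theta_deg) / 2.0)


def g2(tau, diffusion, h, c=1.0):
    """Eq. 3.14 for a single diffusing species."""
    return 1.0 + c * math.exp(-2.0 * h * h * diffusion * tau)


def fit_diffusion(taus, g2_values, h):
    """Least-squares line through ln(g2 - 1) versus tau; slope = -2*h^2*D."""
    ys = [math.log(g - 1.0) for g in g2_values]
    mean_x = sum(taus) / len(taus)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(taus, ys))
    sxx = sum((x - mean_x) ** 2 for x in taus)
    return -(sxy / sxx) / (2.0 * h * h)


def hydrodynamic_radius(diffusion, temperature_k, viscosity_pa_s):
    """Stokes-Einstein relation (Eq. 3.15) solved for r_h, in metres."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion)


# Synthetic data; every numerical value here is an assumption.
h = scattering_vector(n=1.33, wavelength_m=633e-9, theta_deg=90.0)
d_true = 4.6e-11                            # m^2/s
taus = [i * 1.0e-6 for i in range(1, 51)]   # delay times of 1 to 50 microseconds
data = [g2(t, d_true, h, c=0.9) for t in taus]

d_fit = fit_diffusion(taus, data, h)
r_h = hydrodynamic_radius(d_fit, temperature_k=298.15, viscosity_pa_s=0.89e-3)
print(f"D = {d_fit:.3e} m^2/s, r_h = {r_h * 1e9:.2f} nm")
```

Real correlation functions contain noise and, for polydisperse samples, more than one decay, which is why the cumulant and regularization methods mentioned above are used in practice.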


3.2.6.2  Applications

Static  Light  Scattering

The  primary  application  of  static  light  scattering  is  in  the  determination  of  the 
molecular  weight  of  the  macromolecules  in  a  relatively  dilute  solution  of  interest,  in 
order  to  avoid  non-ideality  [175].  High-throughput  light  scattering  techniques  are 
preferred  in  formulation  development  [176]  in  order  to  rapidly  compare  multiple 
formulation  candidates.  Static  light  scattering  can  be  used  to  determine  the  molecu- 
lar weight  of  species  eluting  from  chromatography  columns.  In  this  application  a 
light  scattering  detector  is  coupled  with  a  concentration  detector,  most  commonly  a 
UV  absorbance  and/or  refractive  index  detector  [59,  177].  This  technique  has 
become  a  very  important  tool  to  characterize  the  components  of  drug  products,  and 
other  macromolecular  samples.  An  example  of  a  typical  SEC  analysis  of  multiple 
oligomeric  species  is  illustrated  in  Fig.  3.16. 

Characterization of eluting species by SLS has also proven to be a useful tool in
situations where the kinetics of aggregation or heterogeneous protein interactions are being
studied, or where ligand binding is present. In these experiments different proteins of
interest are mixed in varying ratios, and changes in the molecular weight of the eluting
species are analyzed.

Fig. 3.16  Size exclusion chromatography of heat-stressed human serum albumin (x-axis:
time, min), with the molecular weights as determined using MALLS shown above each peak

Weaker types of protein self-association can also be studied using static light
scattering. If attractive or repulsive forces between molecules result in thermodynamic
non-ideality, then the equation Rθ = KMc can be modified to account for non-ideality

$$\frac{Kc}{R_\theta} = \frac{1}{M} + 2Bc \tag{3.16}$$

where B is the second virial coefficient, a measure of protein non-ideality. The absolute
value of B represents the strength of the interaction; a negative value indicates attraction
and a positive value indicates repulsion. The second virial coefficient is typically
determined by measuring the light scattering intensity of a series of dilutions; the maximum
concentration must be in the concentration range where significant protein non-ideality is
present.
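
Because Eq. 3.16 is linear in concentration, M and B can be read from a straight-line fit of Kc/Rθ against c: the intercept is 1/M and the slope is 2B. The sketch below shows this with an ordinary least-squares fit on synthetic, noise-free values. The molar mass, virial coefficient, concentrations, and units are assumptions chosen for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def molar_mass_and_virial(concs, kc_over_r):
    """Apply Eq. 3.16: intercept = 1/M, slope = 2B."""
    slope, intercept = linear_fit(concs, kc_over_r)
    return 1.0 / intercept, slope / 2.0


# Synthetic dilution series (assumed values).
m_true = 66_500.0      # g/mol
b_true = -2.0e-4       # mol mL / g^2; negative means net attraction
concs = [0.001, 0.002, 0.003, 0.004, 0.005]   # g/mL
kc_over_r = [1.0 / m_true + 2.0 * b_true * c for c in concs]

m_fit, b_fit = molar_mass_and_virial(concs, kc_over_r)
print(f"M = {m_fit:.0f} g/mol, B = {b_fit:.2e} mol mL/g^2")
```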


Dynamic  Light  Scattering 

DLS  is  most  frequently  used  to  determine  size  distribution  of  macromolecules  and 
to  study  protein  aggregation  in  solution  [178,  179].  A  typical  hydrodynamic  radius 
determination  by  DLS  is  presented  in  Fig.  3.17. 
Fig. 3.17  Determination of the hydrodynamic radius of an IgG. (a) Autocorrelation
function fitting (x-axis: delay time τ, s); (b) hydrodynamic radius distribution (x-axis:
hydrodynamic radius, nm). The rh of the monomer was determined to be 5.3 nm


This technique is attractive as it can cover a large size range, requires very little
material, is nondestructive, and can be conducted in a high-throughput configuration. DLS is
increasingly being implemented during protein development as a powerful tool for studying
the effects of solution conditions, formulation components, stress, and forced degradation
on the size and self-association status of the protein. The results are used to inform
candidate selection and the choice of process and formulation conditions [180].

Since  DLS  measures  the  hydrodynamic  radius,  it  is  sensitive  to  changes  in  shape 
as  well  as  size.  This  can  be  used  to  follow  perturbations  in  the  protein  structure, 
changes  in  conformation  and  unfolding.  An  example  of  changes  in  hydrodynamic 
radius  due  to  protein  unfolding  is  shown  in  Fig.  3.18. 

Information  on  the  homogeneity,  or  polydispersity,  of  a  sample  can  also  be 
obtained  from  DLS  measurements.  A  polydisperse  solution  will  result  in  a  multi- 
exponential  decay  in  the  autocorrelation  function.  An  increase  in  the  polydispersity 
even  though  a  single  molecular  weight  is  detected  suggests  either  the  presence  of 
small  amounts  of  larger  species,  or  the  presence  of  other  differently  shaped  mole- 
cules (most  likely  from  unfolding  and  structural  perturbations)  with  the  same 
molecular  weight.  The  sensitivity  of  this  method  increases  with  size  and  DLS  is 
therefore  capable  of  detecting  the  presence  of  extremely  small  amounts  of  particles 
in the submicron range which are often difficult to detect by other techniques. The
practical upper limit for DLS measurements is approximately 1 μm; for larger particles,
other techniques, such as light obscuration, must be used.

Fig. 3.18  Unfolding of cytochrome c in the presence of denaturing agent measured by DLS,
monitoring the increase in hydrodynamic radius

DLS can also be used to study the weak self-associations of macromolecules, by probing
the effect of protein concentration on the diffusion coefficient. When interactions are
occurring the solution will show nonideal behavior, and the measured diffusion coefficient
(Dm) will change with protein concentration until it is equivalent to the self-diffusion
coefficient of the monomeric species (D0) at very dilute concentrations. This dependence on
protein concentration can be represented as

$$D_m = D_0\,(1 + k_D c)$$

where c is the protein concentration. The interaction parameter kD can be determined from
measurements of the diffusion coefficient in a series of protein dilutions, and is a measure
of protein-protein interaction. A negative kD signifies attraction and association between
molecules while a positive kD represents repulsion. This type of analysis can be used to
assess the effects of different solution conditions on the association behavior of a
molecule. DLS experiments can be performed at protein concentrations up to several mg/mL.
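
A common way to extract kD from such a dilution series is a straight-line fit of the measured diffusion coefficient against concentration. The sketch below assumes the linear form Dm = D0(1 + kD c), so that the intercept is D0 and the slope divided by the intercept is kD. The data are synthetic and every numerical value is an assumption chosen for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interaction_parameter(concs, diffusion_coeffs):
    """Return (D0, kD) assuming Dm = D0 * (1 + kD * c)."""
    slope, intercept = linear_fit(concs, diffusion_coeffs)
    return intercept, slope / intercept


# Synthetic dilution series (assumed values).
d0_true = 4.6e-11      # m^2/s
kd_true = -8.0         # mL/g; negative means net attraction
concs = [0.002, 0.004, 0.006, 0.008, 0.010]   # g/mL
measured = [d0_true * (1.0 + kd_true * c) for c in concs]

d0_fit, kd_fit = interaction_parameter(concs, measured)
behavior = "attraction" if kd_fit < 0 else "repulsion"
print(f"D0 = {d0_fit:.3e} m^2/s, kD = {kd_fit:.2f} mL/g ({behavior})")
```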

Other  Applications  of  Light  Scattering 

Static  light  scattering  measures  the  apparent  molecular  weight  of  the  molecule  while 
DLS  measures  the  hydrodynamic  radius.  Combining  these  two  techniques  allows 
one  to  probe  the  shape  and  the  openness  or  density  of  the  protein  or  protein  aggre- 
gates being  analyzed.  Discrepancies  between  the  determined  molecular  weight  of 
the  molecule  being  studied  (modeled  as  a  sphere)  by  SLS  and  the  size  obtained  with 
DLS  can  be  used  as  an  indication  of  irregular  or  elongated  shape.  For  more  accurate 
shape  determination  analytical  ultracentrifugation  should  be  considered;  this  is  a 
complementary  technique  that  can  be  used  over  a  wide  range  of  concentrations. 
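
One simple way to express the comparison described above is to convert the SLS molecular weight into the radius of an equivalent compact sphere and compare it with the hydrodynamic radius from DLS. The sketch below assumes a typical protein partial specific volume of 0.73 mL/g and a molar mass of 150,000 g/mol for an IgG; both values and the function names are assumptions for illustration. A ratio above 1 reflects hydration as well as deviation from a compact sphere, so it is an indication of shape rather than a measurement of it.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, 1/mol


def sphere_radius_nm(molar_mass, partial_specific_volume=0.73):
    """Radius of an anhydrous sphere with the given molar mass (g/mol).

    partial_specific_volume is in mL/g; 0.73 is an assumed typical protein value.
    """
    volume_cm3 = molar_mass * partial_specific_volume / N_A
    radius_cm = (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return radius_cm * 1.0e7  # cm to nm


def shape_ratio(r_h_nm, molar_mass):
    """Hydrodynamic radius divided by the equivalent-sphere radius."""
    return r_h_nm / sphere_radius_nm(molar_mass)


# r_h = 5.3 nm is the IgG value quoted for Fig. 3.17; the molar mass is assumed.
r_sphere = sphere_radius_nm(150_000.0)
ratio = shape_ratio(5.3, 150_000.0)
print(f"equivalent sphere radius = {r_sphere:.2f} nm, r_h / r_sphere = {ratio:.2f}")
```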

One practical application of light scattering for the study of macromolecules is the use
of turbidity to follow protein aggregation. There are two different methods to measure
turbidity: determination of the light loss of the transmitted beam (scatter coefficient) or
determination of the intensity of the scattered light. The scatter coefficient represents
the total scattered light that has been withdrawn from the incident beam, while the scatter
intensity corresponds to how much scattered light has been deflected at a given angle.
Measurements of the scattering intensity can only be performed at lower concentrations
because of multiple scattering, while transmission measurements can be used at higher
protein concentrations. Another limitation of turbidity measurements is that they can only
be performed at wavelengths where the chromophores in the macromolecules do not absorb or
absorb minimally; the visible range is therefore most suitable, but turbidity studies can
also be performed in the near-UV region (320-400 nm). Changes in the signal at 600 nm can be
used to detect the onset of aggregation, an approach often employed in accelerated stress
studies to assess the stability of a given molecule as a function of storage or handling
conditions. Turbidity measurements can be easily performed in a high-throughput experimental
setup and are suitable for measuring aggregation kinetics and for assessing relative
stability to heat, pH, or any other type of accelerated degradation, in order to predict
the relative stability of proteins under targeted conditions.

Other  Techniques 

There are several other more specialized applications of light spectroscopy in the life
sciences, such as reflectance difference spectroscopy, small angle X-ray scattering, and
Raman scattering. More detailed discussions of these types of applications can be found in
several recently published reviews and books [9, 10].

3.2.6.3    Concluding  Comments  on  Light  Scattering 

Both  static  and  dynamic  light  scattering  provide  information  on  size  distribution 
and  hydrodynamic  shape  that  cannot  be  obtained  by  other  spectroscopic  techniques. 
They  are  qualitative  rather  than  quantitative  techniques  that  are  most  commonly 
used  to  assess  the  effect  of  different  solution  conditions  and  stress  on  the  proteins 
being studied. SLS can be used in conjunction with separation methods to identify
the molecular weight of the eluting species, and both SLS and DLS can be used to
understand macromolecular self-association.


3.3    Chapter  Summary 

This chapter illustrated the many different ways that the interactions of light
with macromolecules can be used to gain information on the structure, stability,
interactions, and other properties of proteins and nucleic acids. Each of these techniques
has strengths and weaknesses, but together they make a potent toolbox
that can be used to increase our understanding of the behavior of macromolecules
throughout the life sciences. The biophysicist can select the appropriate methodology
for the questions being addressed, based on the amount of sample and time
available. Most commercially available instruments are currently designed to
perform one type of spectrophotometric technique. More advanced instrumentation
already combines several techniques, e.g., absorbance and fluorescence, or static light
scattering with dynamic light scattering. Wyatt Corporation, a leading manufacturer
in the field of light scattering instruments, integrated SLS, DLS, and the measurement
of zeta potential and electrophoretic mobility into one instrument and refers to
this solution as "Massively Parallel Phase Analysis Light Scattering." The trend of
the future appears to be the integration of multiple spectrometric techniques into one
instrument, allowing data from multiple techniques to be synthesized into an "abstract
picture of protein behavior."
Chapter 4

Diffraction  and  Scattering  by  X-Rays 
and  Neutrons 


Ivan  Rayment 


Abstract  X-ray  and  neutron  scattering  methods  have  played  a  historically  pivotal 
role  in  the  development  of  molecular  biophysics  and  biochemistry.  Today  a  large 
proportion  of  papers  that  address  a  question  in  macromolecular  structure  carry  with 
them  a  pictorial  representation  based  on  structural  features  derived  from  scattering 
methods.  These  images  are  often  compelling  and  yet  it  is  often  difficult  to  assess  the 
presented  information  based  on  pictures  alone.  The  purpose  of  this  chapter  is  to 
allow  the  non-expert  to  develop  a  sense  of  how  much  can  be  accepted  or  what  can 
be  learned  from  the  images  derived  from  scattering  methods. 

Keywords  X-ray  diffraction  •  Neutron  diffraction  •  Small-angle  scattering  • 
Molecular  structure 


4.1    Introduction  to  X-Ray  Scattering  and  Diffraction 
Methods 

4.1.1    Theory  of  X-Ray  Scattering 

X-ray  scattering  can  be  used  to  investigate  many  aspects  of  biological  molecules 
depending  on  the  nature  of  the  interactions  of  the  X-rays  with  the  material  of  interest. 
The  focus  of  this  section  is  on  the  group  of  techniques  that  utilize  elastic  scattering 
of  X-rays  by  electrons.  This  includes  X-ray  crystallography,  fiber  diffraction,  and 
small-angle  scattering. 
In the simplest terms, X-ray scattering arises from the interaction of the electromagnetic
wave of the X-rays with an electron. This causes the electron to oscillate,
which in terms of classical electromagnetic theory allows absorption and reemission
of the X-rays. The emitted radiation has the same wavelength as the
incident wave, but is scattered in all directions. The manner in which the scattered
radiation from one electron interacts with that scattered from other electrons in the
biological material determines the type of experiment. The long-range order in the
samples used for X-ray crystallography and fiber diffraction creates interference
which leads to wide-angle coherent diffraction (a specialized form of scattering),
which in turn leads to detailed molecular models. Conversely, in small-angle X-ray
scattering (SAXS) the random orientation of the molecules in the sample limits the
information in the scattered radiation and yields structural knowledge at the level of
molecular envelopes. Each of these experimental approaches has a unique place in
structural biology.


4.1.2    X-Ray  Crystallography 

Macromolecular crystallography is a well-established and extensively utilized technique
with a long history. In its early days it took many years to determine a single
protein structure; today this can be accomplished within days or less once crystals
have been obtained. Indeed, in many cases X-ray crystallographic studies are now a
routine component of biochemical research. From a practical point of view, many
years of training or apprenticeship are no longer necessary to participate in a structural
determination. There are numerous introductory texts available that cover all
aspects of this discipline [3, 18, 35]. Furthermore, there are outstanding software
packages available that facilitate structural determinations [1, 44] and graphical representation
of the results [7, 29]. A list of software or applications that are useful for
structural analysis can be found at the Protein Data Bank [40]. Thus, the purpose of
this brief introduction is to describe the general limitations of this technique and
what can easily be determined or accepted from an X-ray crystallographic study.

The work flow for a crystallographic study shown in Fig. 4.1 illustrates an important
point. A crystallographic study starts from crystals. That is obvious from the
name of the technique, but what is not immediately obvious is that the inherent
quality and quantity of information in the final structural model are dependent on
the intrinsic order of the crystals. This arises because X-ray crystallography is an
imaging technique, where the experimental data forms the basis for creating an
image of the distribution of electrons in the crystal lattice. The structural model is
obtained by first fitting atoms into that electron density followed by iterative refinement
against the original X-ray data. Confidence in the position of the model atoms
is increased by the incorporation of stereochemical restraints during the refinement
process, but fundamentally, the accuracy of the model is directly related to the quality
of the crystals (not necessarily the size). Thus, there is a lot of benefit in growing
good crystals. For this reason, protein expression, purification, and crystal growth
occupy most of the time required for a modern structural study.


Fig.  4.1  Work  flow  for  an  X-ray  crystallographic  study.  The  ribbon  representation  of  lysozyme 
(PDB  accession  number  193L)  was  prepared  with  the  program  Pymol  [7] 


After the macromolecule has been purified and crystallized, the next step is data
collection. In its simplest terms this involves rotating the crystal in a high-intensity
X-ray beam while recording the diffracted radiation with a detector. In 2011, 80 %
of the structures deposited in the Protein Data Bank were determined with X-rays generated
at synchrotron facilities as opposed to X-ray sources in home laboratories.
There are primarily two reasons for the increased use of synchrotron radiation compared
to 20 years ago, when virtually all structures were determined with conventional
X-ray sources. First, a synchrotron X-ray beam is many orders of
magnitude more intense than can be achieved with in-house X-ray generators. This
increases the intensity of the scattered radiation, which improves the signal-to-noise ratio
observed in the data collection statistics. Hence, synchrotron radiation dramatically
reduces the time needed to record a data set (from days to minutes) and, in turn,
allows very much smaller crystals to be studied. Second, synchrotron radiation provides
a relatively straightforward way to solve the "phase problem" in crystallography,
discussed later.

A  macromolecular  crystal  acts  as  a  three-dimensional  diffraction  grating  and 
yields  a  complementary  three-dimensional  array  of  X-ray  diffraction  maxima  whose 
intensities  can  be  transformed  into  an  electron  density  distribution  by  a  Fourier 
transform  according  to  the  formulae  below 


$$\rho(x,y,z) = \sum_{h}\sum_{k}\sum_{l} |F(hkl)|\, e^{i\phi(hkl)}\, e^{-i2\pi(hx+ky+lz)} \qquad (4.1)$$
Table 4.1 Guide to the interpretation of "resolution"

| Resolution (Å) | Structural implication | Data/parameter ratio (a) |
|---|---|---|
| >5 | Positions of α-helices might be correct but the location of individual atoms is subject to question | 0.15 |
| 3-5 | Protein fold might be correct but positions of side chains are uncertain. A hydrogen bonding pattern based on the electron density map is meaningless | 0.7-0.15 |
| 2.5-3.0 | Fold likely correct, but surface loops might be ill-determined, which might lead to an incorrect model if the protein is not homologous to a known structure. Detailed hydrogen bonding pattern can be questionable, especially for side chains | 1.2-0.7 |
| 2.0-2.5 | Acceptable hydrogen bonding pattern. Water structure and ligands visible | 2.4-1.2 |
| 1.5-2.0 | Well-defined hydrogen bonding pattern with few errors in the side chain conformations | 5.6-2.4 |
| 0.7-1.5 | Atoms become resolved. Many multiple conformations become visible | 55-5.6 |

(a) Theoretical value based on 50 % solvent in the crystalline lattice assuming four parameters per
atom (x, y, z and a temperature factor)


where ρ(x,y,z) is the electron density at a position x,y,z in the crystal and |F(hkl)| is
a structure factor amplitude derived from the X-ray diffraction intensities. φ(hkl) is
the phase of the diffracted beam relative to the incoming radiation, which is lost
when the data is recorded. This is unfortunate, because the phase term contributes
strongly to the resulting electron density maps. The loss of the phase information is
known as the "phase problem" in X-ray crystallography. This simple relationship
shows that all of the X-ray data contributes to the electron density at every point in
the crystal lattice. The level of detail in the electron density depends on the extent
of the X-ray diffraction, which is described in terms of the "resolution" of the data
and is normally given in units of Å. In simple terms, details finer than 0.61 × the
"resolution" cannot be resolved within an electron density map [14]. Strictly speaking,
the resolution of the data is related to the maximum angle of recorded diffraction
according to the expression

$$\frac{1}{\text{resolution}} = \frac{2\sin\theta}{\lambda}$$

where 2θ is the scattering angle and λ is the wavelength of the X-rays. The reported
resolution provides a hint of what might be expected from an X-ray structural determination,
as summarized in Table 4.1. Thus, if the purpose of the structure is to
understand ligand interactions, a resolution of at least 2.5 Å is required, whereas if
the fold alone will suffice to answer the biological question, 3 Å resolution might be
adequate. A resolution better than 2.0 Å is needed before any significant atomicity
can be observed. The term "atomic resolution" is best reserved for structures with a
resolution of better than 1 Å.
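
The relation between resolution, scattering angle, and wavelength can be checked numerically. The following is a minimal sketch of that expression rearranged as resolution = λ / (2 sin θ); the function names and the example wavelength are illustrative assumptions, not values taken from the text.

```python
import math

def resolution_from_angle(two_theta_deg, wavelength):
    """Resolution (same units as wavelength) for a scattering angle 2*theta in degrees."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))

def angle_for_resolution(resolution, wavelength):
    """Scattering angle 2*theta (degrees) at which a given resolution is recorded."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * resolution)))

# Hypothetical experiment with 1.0 Angstrom X-rays:
# how far out must the detector record to reach 2.5 Angstrom data?
needed_two_theta = angle_for_resolution(2.5, 1.0)
```

Because sin θ cannot exceed 1, no data finer than λ/2 can be recorded at a given wavelength, which is one reason the choice of wavelength matters for very high resolution work.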
The resolution defines the amount of detail that can in principle be derived from
the experimental data, but there are two other components that influence the reliability
of the resultant structure. These are the quality of the X-ray data itself, and
the thoroughness of the structural refinement. Both of these are typically judged in
terms of R factors. The quality of the X-ray data is typically defined by

$$R_{\text{merge or sym}} = \frac{\sum_{hkl} \left| I_{hkl} - \langle I_{hkl} \rangle \right|}{\sum_{hkl} I_{hkl}}$$

where Rmerge describes how well multiple measurements of the same or symmetry-related
X-ray diffraction intensity (Ihkl) data merge together about their mean value. This typically has a
value of less than 0.1 for high quality data, but is often higher for weakly diffracting
crystals. The quality of the final refined structure is often judged by three other
numbers, Rfactor, Rwork, and Rfree, which take the form


$$R_{\text{factor, work, or free}} = \frac{\sum_{hkl} \big|\, |F_o| - |F_c| \,\big|}{\sum_{hkl} |F_o|}$$

Rfactor, Rwork, and Rfree are used to assess the quality of the final model as measured
by the similarity between the observed data (Fo) and that calculated from the model (Fc).
Essentially all of the models in the Protein Data Bank have been refined against the
observed data.
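
To make the two statistics concrete, here is a minimal sketch that evaluates them for toy data. It assumes the usual forms in which Rmerge sums the absolute deviation of every repeated intensity measurement from the mean for its reflection, and the R factor sums the absolute difference between observed and calculated structure factor amplitudes. The data layout and function names are hypothetical.

```python
def r_merge(observations):
    """observations maps each hkl to a list of repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

def r_factor(f_obs, f_calc):
    """Sum of | |Fo| - |Fc| | divided by the sum of |Fo| over matched reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Toy data: two reflections, each measured twice.
toy_intensities = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
merge_value = r_merge(toy_intensities)

# Toy amplitudes: observed versus calculated from a model.
model_value = r_factor([10.0, 20.0], [9.0, 22.0])
```

Real data reduction programs apply additional scaling and weighting, so values from this sketch are only indicative.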

What  can  be  expected  from  a  model  at  a  given  resolution  is  described  in  Table  4.1. 
From  the  point  of  view  of  refinement  this  table  presents  an  interesting  problem 
because,  except  for  very  high  resolution  structures,  there  is  insufficient  data  to  refine 
a  model  against  the  X-ray  data  alone.  Examination  of  the  X-ray  diffraction  pattern 
shown in Fig. 4.1 suggests that X-ray crystallography can provide a very large number
of independent data points since each spot or reflection on the image represents
a unique experimental measurement. Ironically, the number of experimental observations
is usually far less than the number of parameters required to independently
describe  the  position  of  atoms  in  a  protein.  The  minimal  number  is  four  parameters 
per  atom,  where  these  consist  of  three  positional  parameters  (x,  y,  and  z)  and  one 
thermal  parameter  to  describe  the  movement  of  the  atom  in  the  crystalline  lattice. 
The  ratio  between  the  data  and  number  of  parameters  must  be  substantially  greater 
than  one  to  allow  independent  refinement  of  the  model  against  that  data. 
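
The data-to-parameter argument can be sketched numerically. The estimate below counts reciprocal lattice points inside the resolution sphere, (4π/3)·V/d³, and halves it for Friedel symmetry. The 18 Å³ per non-hydrogen protein atom is an assumed round figure, not a value from the text; the 50 % solvent content and four parameters per atom follow the footnote to Table 4.1.

```python
import math

def unique_reflections(volume, d_min):
    """Approximate number of unique reflections to d_min for a given cell volume (A^3).

    Counts lattice points in the resolution sphere and halves for Friedel pairs.
    """
    return (2.0 * math.pi / 3.0) * volume / d_min ** 3

def data_to_parameter_ratio(d_min, volume_per_atom=18.0,
                            solvent_fraction=0.5, params_per_atom=4):
    """Observations per refined parameter, evaluated per atom of the model.

    volume_per_atom is an assumed average volume of one non-hydrogen protein atom.
    """
    cell_volume_per_atom = volume_per_atom / (1.0 - solvent_fraction)
    return unique_reflections(cell_volume_per_atom, d_min) / params_per_atom

ratios = {d: data_to_parameter_ratio(d) for d in (5.0, 3.0, 2.5, 2.0, 1.5)}
```

Under these assumptions the ratio scales as 1/d³ and only rises above one somewhere between 3 and 2.5 Å, which is in line with the trend in Table 4.1. Crystallographic symmetry reduces the number of unique reflections and the number of independent atoms by the same factor, so it does not change the ratio.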

The problem of insufficient data is overcome by including stereochemical
restraints such as bond distances, conformational angles, and nonbonding contacts.
The shortage of data means that it is possible to over-fit a model to the data. A measure
of over-fitting is given by Rfree [4]. This is the Rfactor for 5-10 % of the data that
has been excluded at random from the refinement calculations. For example, a test
case in which a cellular retinoic acid-binding protein type II was deliberately built
backwards into the electron density maps yielded an Rwork of 0.21 with good geometry,
but an Rfree of 0.62, which exceeds the value anticipated for a random structure
[15]. Although this is an extreme case, it illustrates that large disparities between
Rwork and Rfree indicate that something is amiss with the model or experimental data.
In practice Rfree is 0.02-0.05 (2-5 %) higher than Rwork for a satisfactory model.

Fig. 4.2 Ramachandran plot for hen egg white lysozyme (PDB accession number: 193L). The
conformational angles for non-glycine residues (black squares) in well-defined X-ray structures
should reside within or close to the fully allowed regions of the Ramachandran plot as indicated by
the arrows
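
The bookkeeping behind Rfree can be sketched as follows: a small random fraction of reflections is set aside before refinement, and the same R statistic is then computed separately for the two sets. The record layout, the function names, and the toy amplitudes are hypothetical. In a real study the calculated amplitudes would come from a model refined only against the working set.

```python
import random

def split_work_free(reflections, free_fraction=0.05, seed=0):
    """Randomly set aside a fraction of reflections that refinement never uses."""
    shuffled = list(reflections)
    random.Random(seed).shuffle(shuffled)
    n_free = max(1, round(free_fraction * len(shuffled)))
    return shuffled[n_free:], shuffled[:n_free]  # (working set, free set)

def r_value(reflections):
    """Sum of | |Fo| - |Fc| | over sum of |Fo| for (hkl, f_obs, f_calc) records."""
    numerator = sum(abs(abs(fo) - abs(fc)) for _hkl, fo, fc in reflections)
    denominator = sum(abs(fo) for _hkl, fo, _fc in reflections)
    return numerator / denominator

# Toy data set: 200 reflections with a uniform 10 % disagreement.
toy = [((h, 0, 0), 100.0, 90.0) for h in range(1, 201)]
work, free = split_work_free(toy, free_fraction=0.05)
r_work, r_free = r_value(work), r_value(free)
```

With real data the two values differ, and it is the size of that gap that signals over-fitting.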

There is considerable difference of opinion about what Rwork should be
for a satisfactory model. In general, models derived from high quality data
(Rmerge < 0.08) should have an Rwork of considerably less than 0.20. Lesser quality
data will generally yield a model with an Rwork of 0.20-0.24. In general, Rwork should
be less than 0.25. In those cases where a high Rfactor is reported for a structure accompanied
by high quality data it can be assumed that either the refinement is incomplete
or there is something wrong with the analysis. The Protein Data Bank may
also list an overall refinement Rfactor, which is the statistic for the final round of
refinement in which the working set and test data set are combined. It is difficult,
and perhaps incorrect, to assess the correctness of a model based on a single global
parameter. Consideration of the stereochemistry and quality of the localized electron
density will almost always lead to a better assessment of the true information
content of the model.

Comparison of the stereochemistry of the model with standard parameters is an
important measure of the quality of the coordinates. This is usually listed as deviations
of bond distances and angles from the normal values, but is easiest to see for
proteins on a Ramachandran plot. This displays the distribution of backbone conformational
angles (ψ and φ) on a background that indicates the conformationally
allowed regions. A good structure shows a tight distribution of angles falling within
the conformationally allowed space adopted by α-helices, β-strands, turns, and
random-coil (Fig. 4.2). All of the information necessary to assess the geometry of
an X-ray structure is available from the Protein Data Bank [40, 45]. This database
contains a wealth of statistical information and a large array of tools and links that
help in evaluating the quality of a structural model. Numerous other servers exist, of
which the MolProbity and PDBsum servers are particularly useful sources for structural
validation and provide scoring functions that help the user assess the quality of
models [6, 21, 28].
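
The backbone angles plotted on a Ramachandran diagram are torsion angles defined by four consecutive atoms: φ by C(i-1), N, Cα, C and ψ by N, Cα, C, N(i+1). As a minimal sketch, a signed torsion angle can be computed from Cartesian coordinates as below. The helper names are illustrative and the coordinates in the example are artificial test points rather than atoms from a real structure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (cis = 0, trans = 180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals to the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Artificial example: the fourth point is rotated a quarter turn about the central bond.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Applying this function to each residue's backbone atoms gives the (φ, ψ) pairs that validation servers compare against the allowed regions.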

Statistics are certainly useful for validating macromolecular structures, but visual
tools are especially compelling since structural studies lead to visual representations.
At the most basic level, examination of the electron density superimposed on
the model is one of the best methods for assessing the validity of a model. For this
reason, many publishers still demand a figure that shows electron density for a section
of the model or for a ligand. This should always be presented from an "omit
map" in which the region of the molecule shown was omitted from the refinement
and phase calculation. This is necessary in order to remove model bias and is particularly
important for verification of the correct modeling for a ligand. As a general
rule, if the electron density does not cover or fit the atoms under consideration, the
location of those atoms is not supported by the experimental data. Real-space correlation
functions are also a useful method of mathematically defining how well a
model matches the electron density and can identify unreliable regions of a model.
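
One common way to express a real-space correlation is as a Pearson correlation coefficient between the experimental and model-derived density sampled at the same grid points around a residue or ligand. The sketch below assumes that form; the function name and the density values are hypothetical.

```python
import math

def real_space_cc(rho_obs, rho_calc):
    """Pearson correlation between observed and calculated density at matched grid points."""
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equally sized samples with at least two points")
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    if var_o == 0.0 or var_c == 0.0:
        raise ValueError("density is constant; correlation is undefined")
    return cov / math.sqrt(var_o * var_c)

# Hypothetical density samples around a well-fitted and a poorly fitted ligand.
good_fit = real_space_cc([0.1, 0.8, 1.5, 0.9, 0.2], [0.2, 0.9, 1.4, 0.8, 0.1])
poor_fit = real_space_cc([0.1, 0.8, 1.5, 0.9, 0.2], [0.9, 0.2, 0.1, 0.3, 1.0])
```

Values near 1 indicate that the model accounts for the local density, while low or negative values flag regions that deserve a closer look.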

To  summarize,  it  is  important  to  consider  the  limitations  of  a  crystallographic 
model  in  the  context  of  the  statistical  descriptors  that  assess  quality,  not  only  at  a 
global  level,  but  also  at  the  level  of  individual  atoms  in  the  macromolecule.  Visual 
representations provided by graphical applications such as Pymol [7] do not necessarily
provide any indication of quality.

The manner in which the phase problem was solved can also influence an assessment
of the correctness of a structure. As noted earlier, X-ray diffraction data is the
result of constructive interference of radiation scattered from a three-dimensional
array of atoms. The scattered radiation has both an amplitude and a phase. The
amplitudes can be measured, but the phase information is lost; however, the latter is
necessary in order to reconstruct an image of the electron density from the amplitudes
(Eq. 4.1). There are two fundamentally different ways that phases are obtained.
Either experimental phases are obtained or phases are estimated from a preexisting
structure. Experimental phases are obtained by perturbing the diffraction through
the inclusion of heavier atoms in the crystal lattice. Determination of the location of
the heavier atoms leads to a structural solution for the entire macromolecule. This is
the basis for multiple isomorphous replacement and anomalous scattering techniques.
Multiple isomorphous replacement with heavy metals was the first method
used to determine X-ray structures. Today, most experimental phases are determined
with anomalous scattering techniques that utilize synchrotron radiation to
determine phases from the location of elements such as selenium, which replaces
the sulfur of methionine in selenomethionine-substituted proteins. Experimental phasing
yields an unbiased structural determination and is the most powerful way of obtaining
new structures, but it does require crystals that contain a heavy atom substitution
or an atom that absorbs X-rays at an accessible wavelength.
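
The importance of the phases can be illustrated with a one-dimensional analogue of Eq. 4.1. The sketch below builds structure factors for a hypothetical two-atom "crystal" of point scatterers, then synthesizes the density once with the true phases and once with every phase set to zero. The atom positions, electron counts, and function names are illustrative assumptions.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """Complex F(h) for a 1-D cell; atoms is a list of (fractional x, electron count)."""
    return {
        h: sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)
        for h in range(-h_max, h_max + 1)
    }

def fourier_synthesis(amplitudes, phases, x):
    """1-D analogue of Eq. 4.1: sum over h of |F| exp(i*phi) exp(-i*2*pi*h*x)."""
    total = sum(
        amplitudes[h] * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
        for h in amplitudes
    )
    return total.real

def peak_position(amplitudes, phases, n_grid=100):
    """Grid index (0 to n_grid - 1) at which the synthesized density is largest."""
    values = [fourier_synthesis(amplitudes, phases, i / n_grid) for i in range(n_grid)]
    return values.index(max(values))

# Hypothetical structure: 8 electrons at x = 0.20 and 6 electrons at x = 0.70.
F = structure_factors([(0.20, 8.0), (0.70, 6.0)], h_max=10)
amplitudes = {h: abs(f) for h, f in F.items()}
true_phases = {h: cmath.phase(f) for h, f in F.items()}
no_phases = {h: 0.0 for h in F}  # what remains when the phases are discarded

with_phases = peak_position(amplitudes, true_phases)   # highest peak at the heavier atom
without_phases = peak_position(amplitudes, no_phases)  # highest peak moves to the origin
```

With the correct phases the strongest peak sits on the heavier atom; with the phases discarded the same amplitudes pile up at the origin and the atomic positions are no longer visible directly.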

The alternative way to solve the phase problem is to use the structure of a related
macromolecule to derive phases. This is known as "molecular replacement" and
has been used to determine more than half the structures in the Protein Data Bank.
With this approach care must be taken to remove the "model bias" introduced from
the search structure [9], which is most easily achieved for high resolution and high
quality data. The normal metrics for structural quality described above should provide
guidance on whether a molecular replacement structure is meaningful, but
examination of the electron density for the places that differ in the target and search
model is often the best indication that model bias has been removed.

As  of  summer  2012  there  were  over  80,000  X-ray  structures  in  the  Protein  Data 
Bank. Although the majority of the coordinates describe proteins, X-ray crystallography
is also useful to determine the structure of DNA, RNA, and carbohydrates, as
well  as  hetero-polymeric  complexes  comprising,  for  example,  protein  and  RNA. 
Thus,  it  is  appropriate  to  ask  whether  there  are  any  inherent  limitations  to  X-ray 
crystallography.  The  obvious  limitation  is  that  the  material  must  form  a  well-ordered 
crystalline  lattice.  The  question  is  whether  the  need  for  a  crystal  lattice  limits  the 
problem  that  can  be  investigated  or  the  information  that  can  be  derived. 

Over the years there has been extensive discussion of whether the conformation
of a macromolecule captured in a crystal lattice reflects the conformation in
solution. In many cases enzymes are active in the crystal lattice, so it is generally
accepted that the structure of domains seen in a crystal reflects the structure in solution.
Clearly, the conformation of loops involved in crystal contacts may not represent
those favored in solution, but a greater issue surrounds macromolecules that
contain  multiple  domains  whose  relationships  with  each  other  influence  biological 
function.  In  this  case,  it  is  quite  likely  that  the  arrangement  of  domains  seen  in  a 
crystal  lattice  could  be  different  from  the  "active"  state.  The  solution  to  this  problem 
is  typically  gained  from  determining  the  structures  of  the  same  macromolecule,  but 
in  different  states  and  under  different  conditions,  and  perhaps  complexed  with 
ligands  or  interacting  partners. 

A  fundamental  limitation  to  almost  all  X-ray  structures  is  that  they  represent  a 
time-averaged  view  of  the  contents  of  the  crystal.  Consequently,  many  of  the 
dynamic  properties  of  individual  macromolecules  are  lost.  The  exception  to  this 
general  rule  is  time-resolved  X-ray  crystallography  that  uses  exceedingly  short 
exposures  (nanoseconds  or  less)  to  follow  coordinated  reactions  within  a  crystal 
lattice [32]. There is some information about movement embedded within the temperature
factors associated with every atom, but in an absolute sense this information
is inaccurate because temperature factors accommodate other crystallographic
issues  such  as  lattice  disorder,  radiation  damage,  absorption  effects,  and  errors  in 
modeling. 

The sizes of molecules that have been crystallized have steadily increased
since the first structures of myoglobin and hemoglobin. The structures of the 80S
ribosome represent the largest macromolecular assemblies whose structures have
been determined by X-ray crystallography, with molecular weights of over four million
[2], but doubtless the structures of larger molecules will be determined. Thus,
size is not necessarily a limitation, though in general, as the contents of the crystal
lattice grow larger the angular extent or resolution of the X-ray data is lower.

At the end of the day it can be argued that any structural information about a
macromolecule is better than none. An X-ray crystallographic structure does not
answer all of the questions and it certainly does not establish biological relevance.
It does, however, provide a molecular framework for understanding the relationship
between sequence and function. Consideration of the resolution and quality of the
data within the context of how well the structure fits into the biological problem will
allow a realistic appreciation of the true information content of a structural model.
An X-ray structure should complement existing biological data. A structure might
lead to new interpretations, but these hypotheses must mirror fundamental principles
of chemistry, biochemistry, and physics. The major errors that have occurred in
X-ray crystallography have all deviated from these principles. An X-ray
structure should be judged by these criteria rather than the aesthetic quality of the
visual representations of the structures generated with programs such as Pymol or
Chimera [7, 29].


4.1.3    Diffraction  from  Noncrystalline  Materials 

X-ray scattering and diffraction from noncrystalline materials is a powerful source
of structural knowledge that complements that available from X-ray crystallography.
It allows the study of fibrous materials that are not amenable to crystallization,
and can provide information about assemblies in solution (small-angle and wide-angle
scattering). These techniques have become much more accessible with the
development of high intensity synchrotron facilities and improved detectors [42].
Synchrotron  sources  are  generally  preferred  since  the  scattered  X-ray  radiation 
from  noncrystalline  materials  is  usually  very  weak.  From  the  perspective  of  this 
brief  introduction  the  question  that  must  be  asked  is:  what  can  be  gained  or  expected 
from  application  of  these  techniques?  The  unique  contributions  of  fiber  diffraction 
and  small-angle  scattering  to  the  study  of  biological  materials  are  described  below. 


4.1.4    Fiber  Diffraction 

Many fibrous biological materials are composed of helical biopolymers. These are
not amenable to crystallization, but can still adopt ordered arrays due to their tendency
to line up in parallel arrays. These fibers can provide important structural
information by X-ray diffraction, even though they are rotationally disordered
around the fiber axis. Indeed, fiber diffraction has played a pivotal role in the development
of structural biology. The early studies of keratin (wool) (Fig. 4.4a) by
William Astbury provided the vital information necessary for the definition of the
α-helix by Pauling, Corey, and Branson [27]. Likewise, fiber diffraction patterns
from DNA by Rosalind Franklin (Fig. 4.4b) as interpreted by Crick and Watson
played a central role in the discovery of the double-helical structure of DNA.

In principle, the experimental arrangement for fiber diffraction is quite simple
(Fig. 4.3), because a few orientations of the fiber should be sufficient to yield the
diffraction data. In practice, modern instrumentation is highly sophisticated because
the diffraction is weak and the samples are often very small. The use of synchrotron
radiation coupled with high resolution CCD-based detectors or, more recently,
photon-counting devices has allowed the study of a wide range of biopolymers,
including simple polypeptides, polynucleotides, cytoskeletal filaments (actin and
myosin), filamentous viruses, and larger biological assemblies such as muscle
fibers. Synchrotron radiation now provides X-ray beams that are just a few microns
in diameter, which permits direct examination of biological materials.

Fig. 4.3 Experimental arrangement for fiber diffraction

Fig. 4.4 X-ray fiber diffraction patterns from (a) α-keratin from a Crested African Porcupine
(Hystrix cristata) (courtesy of Bruce Fraser and David Parry), (b) B-DNA [11], and (c) tobacco
mosaic virus (Gerald Stubbs, Vanderbilt University, [25])

The information content of the diffraction pattern depends on the longitudinal
order in the polymeric material and the relative orientation of the fibers within the
sample (Fig. 4.4). In the initial studies of wool, which is built from intermediate filament
proteins, the diffraction pattern revealed only diffuse diffraction maxima along the
equator related to the packing of the intermediate filaments and a strong diffraction
maximum along the meridian at a spacing of 5.1 Å. The peak at 5.1 Å spacing led
to some confusion in the development of the model for the α-helix since the helical
repeat of a standard α-helix is 5.4 Å. (The history of the α-helix has been elegantly
summarized by David Eisenberg [10].) This discrepancy was resolved by Francis
Crick, who proposed that α-keratin contains dimeric coiled-coils in which the
α-helices are inclined at an angle of 20° to the coiled-coil axis so that the apparent
repeat along the fiber axis is reduced compared to that of a canonical α-helix. This
is a good example of the importance of starting from the correct structural model
when interpreting limited amounts of experimental data.
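
As a rough check on the coiled-coil argument, the helical repeat can be projected onto the fiber axis. The sketch below assumes a simple cosine projection and uses only the 5.4 Å repeat and the 20° inclination quoted above; the function name is illustrative and not taken from any particular software package.

```python
import math

def projected_repeat(helix_repeat_angstrom, tilt_degrees):
    """Project a helical repeat onto the fiber axis (simple cosine projection)."""
    return helix_repeat_angstrom * math.cos(math.radians(tilt_degrees))

# Values quoted in the text: 5.4 A helical repeat, helices inclined at 20 degrees.
apparent = projected_repeat(5.4, 20.0)
print(f"apparent repeat along the fiber axis: {apparent:.2f} A")
```

Under this assumption the projected repeat comes out close to the 5.1 Å meridional spacing observed for wool, which is the essence of the coiled-coil explanation.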

In most fiber diffraction studies there is insufficient experimental data to
determine the three-dimensional structure of the protein or macromolecular assembly
from first principles. Thus, prior structural knowledge is usually needed in order to
develop a model for a filamentous macromolecular assembly. Typically, an existing
structure for a subunit is fitted to the diffraction data to provide the overall orientation
of the molecule within the fiber. A difficulty arises when the molecule undergoes a
conformational change as it polymerizes to become a filament. This problem can
be overcome by molecular dynamics and energy minimization; however, it is not
easy for the nonspecialist reader to appreciate the uniqueness of the structural
solution or the coordinate error for the model. In the first instance, the resultant models
should show tight distributions of conformational angles as assessed by a
Ramachandran plot (Fig. 4.2), and the intermolecular interactions within the filament
must accommodate the hydrogen bonding characteristics of the main-chain atoms
and side chains. Furthermore, hydrophobic residues that are normally found in the
buried intermolecular interfaces should be in close proximity to residues of the same
character. Despite its limitations, fiber diffraction provides access to unique
structural information, which is particularly evident in studies of amyloid proteins.

Amyloid fibrils have been implicated in numerous pathological conditions,
including Alzheimer's disease and prion infections such as Creutzfeldt-Jakob disease.
These fibrils occur when a protein that is normally soluble folds aberrantly to become
an insoluble aggregate that exhibits cross-β structure [26, 37, 41]. These protein
aggregates have proven difficult to study because the proteins are completely
insoluble. The term "cross-β" arises from the characteristic diffraction pattern. This
includes a strong meridional intensity at ~4.7 Å coupled with a weaker
equatorial intensity at ~10 Å. The intense reflection at ~4.7 Å (cross-β diffraction) arises
from the mean separation of hydrogen-bonded β-strands that lie perpendicular to the
fiber axis and form sheets that lie parallel to the axis. The equatorial intensity reflects
the packing of the β-sheets parallel to the fibril axis. Detailed examination of the
diffraction pattern reveals that many fibrils exhibit a similar structure even though there
is often negligible sequence similarity between sections that form amyloids [37].
Fiber diffraction places constraints on the organization of the amyloids in the fiber
that are not available by other means. For example, a fiber diffraction study of
infectious natural prions and recombinant amyloids showed that these assemblies are not
alike at low resolution, even though they both show the characteristic cross-β
meridional diffraction at 4.75 Å (Fig. 4.5). This has important implications for
understanding the structural properties of infectious prion proteins.

Fig. 4.5 Comparison of observed and calculated diffraction patterns for natural Syrian hamster
and recombinant prion proteins, which can be modeled by β-helical and stacked-sheet amyloid
models, respectively. (a, b) Experimental and calculated diffraction patterns from natural Syrian
hamster prion protein (SHaPrP 27-30). (c) Disordered noncrystalline trimeric β-helical model used
to calculate data in (b). (d, e) Experimental and calculated diffraction patterns from a recombinant
Syrian hamster amyloid. (f) Stacked-sheet model used to calculate data in (e). In both models, the
filament axis is perpendicular to the figure plane. Reproduced with permission from [43]

In some cases, such as ordered gels of filamentous viruses, the information
content derived from fiber diffraction is very high (Fig. 4.4c) and permits an ab initio
structural determination [36]. The first example of this was the structural
determination of tobacco mosaic virus [25]. Independent phasing by the use of heavy atom
derivatives coupled with molecular dynamics refinement yielded not only the
interactions between subunits in the helical filament, but also the path and binding
determinants for the enclosed RNA (Fig. 4.6). Although a partial structure of the
individual  protein  subunit  was  known  previously  from  the  structure  of  a  symmetric 
disk,  this  information  was  not  required  to  determine  the  structure  of  the  virus  from 
the  fiber  diffraction  data. 

Highly  ordered  gels  have  also  been  obtained  for  actin  and  bacterial  flagella. 
These  share  with  filamentous  viruses  the  property  that  they  are  built  from  many 
identical  subunits  that  are  organized  in  a  helical  manner  within  the  filament.  The 
key  to  all  of  these  types  of  study  is  obtaining  a  well-ordered  sample.  The  use  of 
magnetic  fields  of  the  order  of  10  T  has  proven  to  be  increasingly  important  for 
obtaining  ordered  fibers  [36,  46]. 

Fig. 4.6 Structure of tobacco mosaic virus determined by fiber diffraction [25] (PDB accession
number 2TMV). In this instance, the fiber diffraction pattern (Fig. 4.4c) yielded sufficient
information not only to trace the polypeptide chain, but also to define the path of the RNA for this viral
assembly

X-ray  fiber  diffraction  has  played  a  central  role  in  understanding  the  molecular 
transitions  during  muscle  contraction.  X-ray  diffraction  is  the  only  technique  that 
can  provide  molecular  structural  information  from  muscle  in  its  physiological 
hydrated state. Indeed, this technique can follow the changes that occur during
muscle contraction [30, 34]. As might be expected from the complexity of muscle, most
of  the  diffraction  is  limited  to  low  resolution  and  is  very  weak.  This  demands  high 
intensity  X-ray  sources  that  are  only  available  at  synchrotron  facilities  and  very 
small  X-ray  beams  that  can  match  the  size  of  a  single  muscle  fiber.  X-ray  micro- 
beams  have  permitted  studies  of  muscle  contraction  from  living  Drosophila  during 
tethered  flight  [8]. 


4.2    Small-Angle  Scattering 

Constructive  interference  of  scattered  X-rays  from  ordered  or  partially  ordered 
macromolecules  is  the  basis  of  X-ray  crystallography  and  fiber  diffraction.  This 
yields  an  enormous  amplification  of  the  signal  compared  to  the  scattering  from  an 
ensemble of molecules that lack any positional order. Even so, scattering from
molecules in solution can provide unique structural information about the
size,  shape,  and  oligomerization  state  of  a  macromolecule.  This  is  the  rationale 
behind  small-angle  scattering  measurements.  Excellent  reviews  of  the  fundamentals 
of  small-angle  and  wide-angle  scattering  and  application  to  macromolecules  are 
found  in  ref.  [16]  and  in  refs.  [31,  42]  respectively. 

Both X-ray and neutron low-angle scattering provide information at 10-20 Å
resolution. This level of detail encompasses the shape and oligomerization state
of the macromolecule. It has the benefit that it covers the molecular size
range that lies between that readily accessible by NMR and electron microscopy.
Furthermore, the experimental requirements are quite simple. A dilute sample of
the macromolecule (low micromolar concentrations) is exposed to a collimated or
focused beam of radiation. The key requirements are that the macromolecule
must  be  monodisperse  and  stable  in  solution.  The  stability  is  less  important  for 
experiments  performed  with  synchrotron  radiation  since  these  can  be  performed 
in  seconds,  but  is  much  more  of  a  concern  for  data  collected  with  conventional 
X-ray  sources. 

X-ray scattering from macromolecules in solution is exceedingly weak
compared to the scattering from the solvent. At low resolution the signal is derived
from the contrast between the solvent and protein. For pure water, the average electron
density is ~0.33 electrons/Å³, whereas proteins are typically around 0.44 electrons/Å³.
The value is higher still for RNA and DNA. The signal is obtained by subtracting the
scattering of a buffer sample from that of the biological sample. This must be done
very precisely. The resultant scattering curve, I(q), is radially symmetric because the
molecules are randomly oriented in solution. I(q) is the intensity of the scattering as a
function of scattering angle, where q = (4π sin θ)/λ, 2θ is the scattering angle, and λ is
the wavelength. q is the magnitude of the momentum transfer vector.
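
To give a feel for the numbers, the following sketch converts a scattering angle into q and into an approximate real-space length scale. It assumes the usual convention q = 4π sin θ / λ, with θ taken as half of the scattering angle, and the relation d = 2π/q; the wavelength of 1.54 Å is only an illustrative choice, and the function names are hypothetical.

```python
import math

def momentum_transfer(two_theta_degrees, wavelength_angstrom):
    """q = 4*pi*sin(theta)/lambda, with theta equal to half the scattering angle."""
    theta = math.radians(two_theta_degrees) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom

def real_space_distance(q):
    """Approximate real-space length scale probed at a given q (d = 2*pi/q)."""
    return 2.0 * math.pi / q

wavelength = 1.54  # assumed wavelength in Angstrom (illustrative only)
for two_theta in (0.5, 1.0, 5.0):
    q = momentum_transfer(two_theta, wavelength)
    d = real_space_distance(q)
    print(f"2theta = {two_theta:4.1f} deg   q = {q:.4f} 1/A   d = {d:6.1f} A")
```

The smaller the scattering angle, the larger the distance being probed, which is why information about overall size and shape is concentrated close to the direct beam.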

The equation that describes the scattering from the electron density of a
homogeneous sample is given by:

$$I(q) = 4\pi \int_0^{D_{max}} P(r)\,\frac{\sin(qr)}{qr}\,dr$$

where P(r) is the pair density distribution function (PDDF) and D_max is the
maximum distance present in the sample. One of the simplest items that can be derived
from small-angle scattering is the radius of gyration (R_G). At low angles the scattering
can be described by the Guinier approximation:

$$I(q) = I(0)\exp\left(-q^2 R_G^2/3\right)$$

This is typically plotted as ln[I(q)] = ln[I(0)] − q²R_G²/3 (Fig. 4.7b), where the
slope will yield the radius of gyration, and the extrapolation to zero scattering angle
can provide the molecular weight of the species in solution. For most globular
biological macromolecules, this analysis should only be performed in the region closest
to the beam stop or center of the scattering pattern where the product qR_G is less than
1.3, although the range is considerably smaller for asymmetric molecules. This type
of analysis can be performed very rapidly and has been utilized for well over 55 years
[12]. It provides a valuable first insight into the oligomerization state of the molecule,
but the complete scattering pattern contains considerably more information.
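
A Guinier analysis amounts to a straight-line fit of ln I(q) against q². The sketch below illustrates this on synthetic, noise-free data for a hypothetical particle; the function name, the test values, and the use of an initial R_G guess to select the range qR_G < 1.3 are assumptions made for illustration, not a description of any specific analysis program.

```python
import numpy as np

def guinier_fit(q, intensity, rg_guess):
    """Fit ln I(q) = ln I(0) - q^2 Rg^2 / 3 over the range q * Rg < 1.3."""
    mask = q * rg_guess < 1.3
    slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
    rg = np.sqrt(-3.0 * slope)   # slope of the Guinier plot is -Rg^2 / 3
    i0 = np.exp(intercept)       # intercept is ln I(0)
    return rg, i0

# Synthetic data for a hypothetical particle with Rg = 20 A and I(0) = 1000.
q = np.linspace(0.005, 0.15, 200)   # 1/A
true_rg, true_i0 = 20.0, 1000.0
intensity = true_i0 * np.exp(-(q ** 2) * true_rg ** 2 / 3.0)

rg, i0 = guinier_fit(q, intensity, rg_guess=20.0)
print(f"Rg = {rg:.2f} A, I(0) = {i0:.1f}")
```

With real data the fit range is usually refined iteratively, since the limit qR_G < 1.3 depends on the value of R_G being determined.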

The  small-angle  scattering  typically  extends  well  beyond  the  range  of  the  Guinier 
approximation, where this part of the profile is controlled by the shape of the
molecular envelope (Fig. 4.7a). The molecular envelope controls the profile because the
scattering  signal  is  derived  from  the  difference  between  the  average  electron  density 
of  the  solute  molecules  and  the  bulk  solvent.  The  variation  in  electron  density  within 
a  macromolecule  is  lost  at  low  resolution.  Thus,  SAXS  is  ideal  for  identifying  or 
characterizing  unfolded  proteins.  This  is  often  evaluated  from  a  Kratky  plot  (Fig.  4.8) 






Fig. 4.7 Small-angle X-ray scattering (SAXS) data. (a) X-ray scattering intensity log(I) vs.
scattering angle (q). Arrows point to regions of the plot that correspond to structural information.
(b) A Guinier plot, ln(I) vs. q², provides information about the size of the molecule and the quality
of the sample. For a globular particle, this plot should be linear at small values of q (q_max·R_G < 1.3).
A sharp drop-off in the plot is indicative of interparticle repulsion while an upward curve is
indicative of aggregation. Dashed line indicates data extrapolation, based on the Guinier equation, to
q = 0 (arrow). Reprinted with permission from [22]


Fig. 4.8 The Kratky plot, q²I vs. q, indicates the extent of folding within a
macromolecule. Molecules with an extensive tertiary fold (collapsed) result in a
different profile than non-globular molecules (extended) or unfolded molecules


in which q²I is plotted against q. Globular proteins typically show a parabola-shaped
plot,  whereas  unstructured  proteins  do  not  exhibit  a  peak  and  are  approximately 
linear  with  respect  to  q  at  high  values  of  q. 
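
The contrast between folded and unfolded behavior in a Kratky plot can be illustrated with two idealized models. The sketch below is only a caricature under stated assumptions: the compact particle is represented by a Guinier-like Gaussian falloff, and the unfolded chain by the Debye function for an ideal random coil, which rises toward a plateau rather than forming a peak. The radius of gyration and the q range are hypothetical.

```python
import numpy as np

def kratky(q, intensity):
    """Return the Kratky transform q^2 * I(q)."""
    return q ** 2 * intensity

def compact_particle(q, rg):
    """Idealized compact particle: Guinier-like Gaussian falloff (an assumption)."""
    return np.exp(-(q * rg) ** 2 / 3.0)

def gaussian_chain(q, rg):
    """Debye function for an ideal random coil, used here as an unfolded model."""
    x = (q * rg) ** 2
    return 2.0 * (np.exp(-x) + x - 1.0) / x ** 2

q = np.linspace(0.001, 0.3, 3000)   # 1/A
rg = 20.0                           # hypothetical radius of gyration in A

folded = kratky(q, compact_particle(q, rg))
unfolded = kratky(q, gaussian_chain(q, rg))

q_peak = q[np.argmax(folded)]
print(f"compact model: Kratky maximum near q*Rg = {q_peak * rg:.2f}")
print(f"coil model: curve still rising at the highest q? {unfolded[-1] > unfolded[-2]}")
```

For the Gaussian model the maximum falls at qR_G = √3, whereas the coil model never turns over; real proteins lie between these limits, so the plot is best read qualitatively.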

Fig. 4.10 Combination of SAXS with NMR in the structural determination of RNA. (a) The ab
initio structure of U2/U6 spliceosomal RNA determined by SAXS. (b) The fit of the ten lowest energy
models determined from the NMR to the SAXS envelope. (c) The lowest energy model of that
ensemble. Modified with permission from [5]. The SAXS data was used both to select the best
models from the ensemble of NMR coordinates and as a restraint in the final model refinement

Careful analysis of the scattering profile can provide unique information about
the shape of the molecular envelope. This is often analyzed in terms of the pair
distribution function P(r) (also known as the PDDF), which describes the spherically
averaged distribution of interatomic vectors within the molecular envelope and
is somewhat analogous to the crystallographic Patterson function, a Fourier synthesis
that maps interatomic vectors. The nature of the PDDF varies considerably
depending on the overall shape of the molecule (Fig. 4.9), and can be calculated
through a Fourier transform of the scattering curve. Equally important, a
theoretical scattering curve can be calculated by a Fourier transform of a P(r)
computed from a trial model and compared with the experimental curve.
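
The relationship between P(r) and the scattering curve can be checked numerically for a shape whose answer is known. The sketch below assumes a uniform sphere as the trial model, uses the standard analytical expressions for its P(r) and its scattering, and evaluates I(q) = 4π ∫ P(r) sin(qr)/(qr) dr with a simple trapezoidal rule. The radius, grids, and function names are illustrative assumptions.

```python
import numpy as np

def sphere_pr(r, radius):
    """Unnormalized P(r) of a uniform sphere; it vanishes at Dmax = 2 * radius."""
    x = np.clip(r / (2.0 * radius), 0.0, 1.0)
    return r ** 2 * (1.0 - 1.5 * x + 0.5 * x ** 3)

def intensity_from_pr(q, r, pr):
    """I(q) = 4*pi * integral of P(r) * sin(qr)/(qr) dr (trapezoidal rule)."""
    # np.sinc(t) is sin(pi*t)/(pi*t), so sin(qr)/(qr) equals np.sinc(qr/pi).
    kernel = np.sinc(np.outer(q, r) / np.pi)
    integrand = pr * kernel
    dr = np.diff(r)
    area = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dr, axis=1)
    return 4.0 * np.pi * area

def sphere_form_factor(q, radius):
    """Analytical normalized scattering of a uniform sphere, for comparison."""
    x = q * radius
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2

radius = 25.0                                # hypothetical sphere radius in A
r = np.linspace(0.0, 2.0 * radius, 2001)     # real-space grid up to Dmax
q = np.linspace(0.01, 0.3, 50)               # 1/A

pr = sphere_pr(r, radius)
numeric = intensity_from_pr(q, r, pr) / intensity_from_pr(np.array([0.0]), r, pr)[0]
max_deviation = np.max(np.abs(numeric - sphere_form_factor(q, radius)))
print(f"largest deviation from the analytical curve: {max_deviation:.2e}")
```

The same integral, applied to a P(r) computed from the atomic coordinates of a trial model, gives the theoretical curve that is compared with experiment.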

Construction and validation of low-resolution three-dimensional models from
SAXS data is the focus of most current applications of SAXS, as reviewed in [20].
Most ab initio approaches create a simplified model for the macromolecular
envelope based on dummy atoms or amino acid residues (Fig. 4.10a). This provides
insight into the molecular shape, but the number of statistically independent data
points for most SAXS curves is quite limited. Consequently, the most successful
applications of SAXS utilize this data in conjunction with other macromolecular
information [33]. This often takes the form of building solution-state models
starting from domains or components that have been determined independently at high
resolution. SAXS data has also proved to be an enormously valuable restraint in
NMR structural determinations [20, 33]. An example of the latter is seen in the
structural determination of the U2/U6 snRNA complex (Fig. 4.10) [5].

Assessment of the quality of the structural models derived from SAXS is an
important component of any study that utilizes this technique [31]. Once an envelope
has been derived, the fit of an ensemble of high resolution models to that envelope is
usually judged by its normalized spatial discrepancy (NSD) [17]. For satisfactory
models NSD should be less than 0.8. Several criteria have been developed for
measuring how well the model fits the experimental data itself [31]. One of these is the
normalized discrepancy function χ², which is defined as:

$$\chi^2 = \frac{1}{N_p - 1} \sum_{i} \left[ \frac{I(q_i)_{exp} - c\,I(q_i)_{calc}}{\sigma(q_i)} \right]^2$$

where c is a scaling factor, σ(q_i) is the experimental error, and N_p is the number of
observations. This function is minimized during the construction of ab initio
models and, ideally, should be around 1.5 for an optimal model. Values of χ² less than
unity indicate over-fitting of the model to the data [20, 38].
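
The discrepancy function is straightforward to evaluate once a model curve is available. In the sketch below, the sum of squared, error-weighted residuals is divided by N_p − 1, and the scaling factor c is obtained by weighted linear least squares; both choices are assumptions made for illustration, as are the synthetic curves and the function name.

```python
import numpy as np

def reduced_chi_square(i_exp, i_calc, sigma):
    """Return (chi2, c) for a model curve scaled onto experimental data."""
    # Scale factor c from weighted linear least squares (an assumed choice).
    weights = 1.0 / sigma ** 2
    c = np.sum(weights * i_exp * i_calc) / np.sum(weights * i_calc ** 2)
    residuals = (i_exp - c * i_calc) / sigma
    chi2 = np.sum(residuals ** 2) / (len(i_exp) - 1)
    return chi2, c

# Synthetic example: the "experiment" is the model times 2 plus Gaussian noise.
rng = np.random.default_rng(0)
q = np.linspace(0.01, 0.3, 300)
i_calc = np.exp(-(q * 20.0) ** 2 / 3.0)
sigma = 0.01 * np.ones_like(q)
i_exp = 2.0 * i_calc + rng.normal(0.0, sigma)

chi2, c = reduced_chi_square(i_exp, i_calc, sigma)
print(f"scale factor c = {c:.3f}, chi^2 = {chi2:.2f}")
```

When the error estimates are realistic and the model is correct, the value is expected to lie near one; values well below one suggest that the errors are overestimated or that the model has been over-fitted.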


4.3    Neutron  Scattering  and  Diffraction  Methods 


Neutron  scattering  provides  complementary  information  to  that  derived  from  X-ray 
scattering.  The  major  difference  is  that  neutrons  are  scattered  by  the  nucleus  while 
X-rays  are  scattered  by  electrons.  Furthermore,  the  extent  of  scattering  is  related  to 
the  nuclear  structure  rather  than  just  the  number  of  nucleons,  where  this  is  defined 
in  terms  of  a  scattering  length  for  each  atom.  For  example,  hydrogen,  deuterium, 
carbon, nitrogen, and oxygen have scattering lengths of −3.7, 6.7, 6.7, 9.4, and
5.8 × 10⁻¹³ cm, respectively. This means that hydrogen and deuterium, in an absolute
sense,  scatter  with  similar  efficiency  to  the  other  atoms  in  the  polypeptide  chain,  and 
implies  that  hydrogen  and  deuterium  atoms  should  be  readily  visible  in  Fourier 
synthesis  maps  (Fig.  4.11a,  b).  This  is  in  contrast  to  electron  density  maps  derived 
from  X-ray  crystallography  where  hydrogen  atoms  are  usually  not  observed,  except 
in  ultrahigh  resolution  X-ray  studies,  due  to  the  weak  scattering  power  of  a  single 
electron.  The  negative  sign  for  the  scattering  length  for  hydrogen  compared  to  other 
atoms  means  that  hydrogen  nuclei  appear  as  negative  density  in  a  Fourier  map  and 
are thus readily discernible from other atoms. Another benefit of neutron diffraction
is that the intensity of scattering does not fall off at large scattering angles because
the scattering nuclei are so small. There are two related factors that limit the
application of neutron scattering to structural biology. The first is that there are only a
small number of facilities that can provide a beam of thermal neutrons that have a
wavelength of ~1 Å, and the second is that the intensity of the neutron sources
themselves is limited compared to X-ray sources. The latter factor generally demands
large  samples  and  lengthy  data  collection  times  in  comparison  to  those  required  for 
X-ray  scattering. 

Fig. 4.11 Neutron diffraction reveals the location of hydrogen atoms. (a, b) The neutron density
for elastase contoured at positive and negative density levels [39]. (c) The neutron density for
rubredoxin [23]. The crystals of elastase were grown from protein isolated from normal water and
transferred to D₂O, and the neutron data were recorded to 1.65 Å resolution. Here, protons replaced
by deuterons show as positive neutron density due to the positive scattering factor for deuterium.
Conversely, non-exchangeable hydrogen atoms show as negative density due to their negative
scattering length (b). This study clearly shows that a deuterium atom is in close proximity to the ring
nitrogen of the histidine and is not participating in a low barrier hydrogen bond. (c) The neutron
density from a perdeuterated sample of rubredoxin that was recorded from a 14 h data collection
run, demonstrating that recent developments in detector technology and improvements in neutron
sources make neutron crystallography an exciting option for locating hydrogen atoms in
macromolecules (PDB accession numbers shown in parentheses)


4.4    Neutron  Crystallography 


In  general,  neutron  crystallography  is  applied  to  problems  for  which  the  structure 
has  already  been  determined  by  X-ray  crystallography.  There  are  three  reasons  for 
this.  First,  it  is  difficult  to  solve  a  structure  by  neutron  scattering  since  the  scattering 
length  of  metals  or  other  elements  that  might  be  substituted  is  not  very  different 
from  that  of  carbon,  nitrogen,  or  oxygen.  Consequently,  it  is  difficult  to  solve  the 
phase problem by neutron diffraction. Second, access to neutron facilities is limited
and the sources are weak compared to synchrotron sources. Third,
crystals that are large by X-ray standards are generally required in order to enhance the
amplitude of the scattered radiation. Crystals with a volume of over 1 mm³ were
originally required, though the new radiation sources have reduced this volume
considerably [23, 24]. In many instances it is challenging to grow very large crystals.
Another problem that is not often appreciated is that neutron beams are not
monochromatic (single wavelength), but exhibit a range of wavelengths. This is necessary
in order to obtain a high flux, but causes overlap between adjacent reflections
(measurements), which in turn reduces the completeness of the data. Even with these
limitations, neutron crystallography has made significant contributions to
macromolecular structure, particularly in those instances where the protonation state of an
amino acid or ligand is important.

A  good  example  of  the  value  of  neutron  crystallography  is  seen  in  the  study  of 
serine proteases. This class of proteases utilizes a catalytic triad of residues in its
active site. The triad consists of aspartic acid, histidine, and serine residues, where the
serine functions as the nucleophile that attacks the substrate carbonyl moiety.
The histidine functions as a catalytic base and acid in the mechanism. A question
has surrounded the role of the aspartate residue and whether a low barrier hydrogen
bond exists between the histidine and aspartate side chains. There is no question that
these two residues are connected by a strong hydrogen bond, but the exact location of
the hydrogen in that bond has been the subject of considerable discussion. A low
barrier hydrogen bond requires that the hydrogen be close to the middle or shared
by the carboxylate oxygen of the aspartate and the side chain nitrogen of the
histidine. Despite several ultrahigh resolution X-ray crystallographic studies of serine
proteases, the controversy has continued. A recent combined X-ray and neutron
study clearly shows that in elastase, when complexed with a transition state analog,
the hydrogen bond is 2.6 Å long and the hydrogen is 0.80-0.96 Å from the
histidine nitrogen (Fig. 4.11a) [39]. This position is consistent with a short but
conventional hydrogen bond, and is not consistent with a low barrier hydrogen bond. The
ability  to  observe  the  hydrogen  (deuterium  in  this  case)  resolves  the  question  of 
whether  or  not  a  low  barrier  hydrogen  bond  is  present  in  this  particular  complex.  It 
does  not  answer  the  question  whether  a  low  barrier  hydrogen  bond  ever  exists  in 
serine  proteases,  so  the  discussion  continues. 

Generally, macromolecules are perdeuterated where possible or at least
transferred to D₂O for data collection. This is necessary because hydrogen has an
anomalously large incoherent scattering cross section compared to other nuclei. In nuclear
scattering the unit of measurement is the "barn" (10⁻²⁸ m²), which was originally
defined as the area cross section of a uranium nucleus. The nuclear cross section
varies dramatically from one isotope to another for the same element and is not
related simply to the number of nucleons. Thus, the incoherent scattering cross section for
hydrogen is ~80 barns, whereas that for deuterium, nitrogen, and carbon is zero,
0.49, and zero, respectively. This incoherent scattering from hydrogen results in a
large increase in the background. Full deuteration requires expression systems
adapted to growth in D₂O and deuterated carbon sources. Together these increase
the signal-to-noise ratio in the diffraction pattern and allow for smaller sample sizes.
Considerable improvements have also been made in detector technology and
neutron sources [13, 23]. Consequently, increased application of neutron
crystallography can be anticipated in future years. An example of these improvements is seen in
the neutron density for rubredoxin that was recorded in 14 h to 1.5 Å resolution
(Fig. 4.11c).


4.5    Small-Angle Neutron Scattering

The  concepts  described  for  SAXS  can  be  applied  to  neutron  small-angle  scattering. 
However,  in  this  case  the  difference  in  sign  for  the  scattering  length  for  hydrogen  and 
deuterium  atoms  means  that  it  is  possible  to  adjust  the  scattering  density  or  contrast 
of the macromolecule relative to the solvent by varying the ratio of H₂O to D₂O.
This implies that a sample comprising two polymers can be prepared to distinguish
the location and conformation of each individual chain. In such a case, one chain is
hydrogenated and the other is perdeuterated. By adjusting the solvent contrast to
match that of the hydrogenated chain, one can observe the scattering of the
perdeuterated chain alone. This has profound implications for the examination of
multicomponent macromolecules by small-angle neutron scattering [19]. Use of this
powerful technique is limited by access to suitable facilities.
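
The idea of contrast matching can be made concrete with a short calculation of the scattering length density of H₂O/D₂O mixtures. The sketch below uses the scattering lengths for hydrogen, deuterium, and oxygen quoted earlier in this chapter; the volume of 30 Å³ per water molecule and the scattering length density assigned to the hydrogenated protein are assumed, illustrative values rather than measured ones, and the function names are hypothetical.

```python
# Coherent scattering lengths quoted in the text, in units of 1e-13 cm (fm).
B_H, B_D, B_O = -3.7, 6.7, 5.8

WATER_VOLUME_A3 = 30.0   # assumed volume per water molecule in cubic Angstrom

def water_sld(fraction_d2o):
    """Scattering length density (1e-6 / A^2) of an H2O/D2O mixture."""
    b_hydrogen = (1.0 - fraction_d2o) * B_H + fraction_d2o * B_D
    b_molecule_fm = 2.0 * b_hydrogen + B_O
    # 1 fm = 1e-5 A, so fm / A^3 = 1e-5 / A^2 = 10 x (1e-6 / A^2).
    return 10.0 * b_molecule_fm / WATER_VOLUME_A3

def match_point(target_sld):
    """Fraction of D2O at which the solvent SLD equals target_sld."""
    sld_h2o, sld_d2o = water_sld(0.0), water_sld(1.0)
    return (target_sld - sld_h2o) / (sld_d2o - sld_h2o)

protein_sld = 2.4   # hypothetical value for a hydrogenated protein, 1e-6 / A^2
f = match_point(protein_sld)
print(f"H2O: {water_sld(0.0):.2f}, D2O: {water_sld(1.0):.2f} (1e-6 / A^2)")
print(f"solvent matches the hydrogenated component at about {100 * f:.0f}% D2O")
```

At the match point the hydrogenated component contributes essentially no contrast, so the remaining signal arises from the perdeuterated component. In practice the match point also depends on exchangeable hydrogens, which this sketch ignores.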


4.6  Summary 

Elastic  scattering  techniques  by  X-rays  and  neutrons  have  had  a  profound  influence 
on  the  development  of  the  fundamentals  of  biology.  These  constitute  a  versatile  set 
of  tools  that  can  provide  information  over  a  wide  range  of  scales  of  both  resolution 
and  molecular  size.  Understanding  the  limitations  of  these  methods  allows  the 
information  they  yield  to  be  incorporated  appropriately  into  molecular  studies  of 
biological  systems. 

Chapter 5

Nuclear  Magnetic  Resonance  Spectroscopy 

Thomas  C.  Pochapsky  and  Susan  Sondej  Pochapsky 


Abstract Nuclear magnetic resonance (NMR) has developed into an important
tool for investigating the structure and dynamics of biomacromolecules in solution,
associated with membranes, and in solids. This chapter provides an introduction to
the theory of NMR and a description of basic concepts (excitation of NMR
transitions, spin populations and coherence, relaxation phenomena, signal detection and
processing). Types of structural and dynamic information available from NMR
experiments are noted. Standard experiments used for sequential assignment of
resonances in biomolecules in solution and solid state are discussed, along with
instrumentation and sample requirements. In particular, the need for selective and
uniform isotope labeling is detailed. Software used to process NMR data and
generate structural and dynamic information is noted, and data needed for structure
determinations and dynamic analysis are outlined.

5.1    What Is Nuclear Magnetic Resonance Spectroscopy?

Perhaps  no  single  technique  since  the  advent  of  protein  X-ray  crystallography  has 
made as great an impact on our understanding of biomolecular structure and
dynamics as nuclear magnetic resonance (NMR) spectroscopy. While a thorough technical
description of NMR is beyond the scope of what can be accomplished in a single
chapter, we will attempt to provide readers with a sufficient overview of the
potential of NMR in biophysical research that they can judge for themselves whether it
might  be  usefully  applied  to  their  own  work.  For  those  desiring  a  more  detailed 
description  of  the  technical  and  theoretical  aspects  of  NMR,  we  direct  them  to  our 
textbook  on  the  subject  [1]. 

NMR makes use of radio frequencies (RF) between ~10 and 1,000 MHz to probe
the transitions of nuclear spins in the presence of a static magnetic field. Nuclear
spin, designated by a quantum number I, is determined by the relative number and
arrangement of protons and neutrons in the atomic nucleus and, as such, depends
upon which isotope of an element is present. For example, hydrogen has three
isotopes, protium (¹H, I = 1/2), deuterium (²H, I = 1), and tritium (³H, I = 1/2). If I is not
equal to zero, the magnetic quantum number m, with allowed values of −I, −I+1,
..., I−1, I, determines in which spin state the nucleus resides. For example, ¹H and
³H can each occupy either of two spin states, m = −1/2 and m = 1/2, while ²H has three
allowed states, m = −1, m = 0, and m = 1. In the absence of a magnetic field (or in the
case of ²H, an electric field gradient), all of the allowed spin states are degenerate,
or energetically equal. However, in the presence of a static magnetic field, the spin
states become nondegenerate, and transitions between those states can be probed
spectroscopically. The degree of (Zeeman) splitting between nondegenerate spin
states is determined by (5.1):

$$2\pi\nu = \omega = \gamma B_0 \qquad (5.1)$$

where ν is the transition frequency in s⁻¹ (Hertz or Hz), ω is the transition
frequency (also called the Larmor frequency) in angular units (radians/s), B0 the strength
of the magnetic field (expressed in Tesla), and γ the gyromagnetic ratio, a constant
that depends upon the identity of the nuclide. As such, the frequency of an NMR
transition depends upon both the strength of the applied magnetic field and the identity
of the nucleus being observed. Figure 5.1 shows the Zeeman splitting for the three
isotopes  of  hydrogen  as  a  function  of  magnetic  field  strength.  Because  the  signal 
to noise ratio (S/N) in the NMR experiment is proportional to √(γ³B0³), all other
things being equal, the highest S/N is obtained by observing the nucleus present
with the largest γ at the highest available magnetic field strength. As can be seen
from Fig. 5.1, 3H (which is radioactive) has the largest γ of the three hydrogen
isotopes, and at 11.74 T has a transition frequency of 533.3 MHz. However, 1H is
not much lower in frequency than tritium, resonating at 500 MHz at the same field
strength, and as 1H is abundant and not radioactive, it is the nuclide of choice for
hydrogen NMR. Deuterium is not only less sensitive than either 1H or 3H, it is also
a nuclear quadrupole (I > 1/2), which introduces some complications for the NMR
experiment that will not be discussed in detail here. In fact, 1H has the highest γ of
any stable nuclide, and based on its abundance in biomolecules, is usually the
nucleus of choice for detection in solution biomolecular NMR experiments. We will
see  that  this  is  not  the  case  for  solid-state  NMR,  for  reasons  to  be  discussed  later 
in  this  chapter. 
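
As a quick numerical check of (5.1), the following minimal Python sketch converts a gyromagnetic ratio and a field strength into a transition frequency. The gyromagnetic ratios are the values listed in Table 5.1; the function and dictionary names are illustrative choices, not part of any NMR software package.

```python
import math

# Gyromagnetic ratios in rad s^-1 T^-1, as listed in Table 5.1 of this chapter.
GAMMA = {
    "1H": 26.7510e7,
    "2H": 4.1064e7,
    "13C": 6.7263e7,
    "15N": -2.7116e7,
}


def larmor_mhz(gamma, b0_tesla):
    """Return the transition frequency nu = |gamma| * B0 / (2*pi) in MHz."""
    return abs(gamma) * b0_tesla / (2.0 * math.pi) / 1.0e6


if __name__ == "__main__":
    # Print the resonance frequency of each nuclide at 11.74 T.
    for nuclide, gamma in GAMMA.items():
        print(f"{nuclide}: {larmor_mhz(gamma, 11.74):.1f} MHz")
```

The absolute value is taken because the sign of γ (negative for 15N) sets the sense of precession, not the magnitude of the transition frequency.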


5.2    Excitation  of  NMR  Transitions 

Nuclear Zeeman splitting in standard NMR magnets yields transition frequencies
between ~10 and 1,000 MHz, which lie in the radio frequency (RF) range of the
electromagnetic spectrum, and excitation of NMR transitions on modern spectrometers
is almost exclusively performed using RF pulses. An RF pulse consists of a
short burst of RF energy defined by the frequency of the RF used to generate it (νRF,
called the carrier frequency, often but not exclusively the frequency near the
center of the spectral region of interest), the amplitude of the RF (usually measured
in decibels of attenuation of the full power available from the RF amplifier
hardware), the shape of the pulse (i.e., amplitude as a function of time) and the pulse
duration τp. Despite the fact that the pulse is generated from a single RF frequency
(νRF), the finite duration of the pulse results in "blurring," so that a range of
frequencies on either side of the carrier frequency is also excited by the pulse. The
excitation bandwidth of a rectangular pulse with duration τp (in seconds) is determined by
the expression:


$$\Delta\nu = \nu_{\mathrm{null}} - \nu_{\mathrm{RF}} = \tau_p^{-1} \qquad (5.2)$$
Δν is the distance in Hz between the center frequency of the pulse and the first
excitation null, νnull. Note that (5.2) is an inverse relationship: the shorter the pulse
duration, the wider the range of frequencies it excites. Thus, for selective excitation
(where only a narrow spectral region is excited) a long-duration pulse is applied,
while for broadband excitation of all or most of the spectral region of interest, a shorter
pulse is used. Figure 5.2 shows graphically the relationship between pulse length
and excitation bandwidth. Another important variable is pulse power, a measure of
energy input into the sample (and probe) per unit time. Too much power over too
long a time can damage the delicate components that make up the NMR probe or
destroy the sample through dielectric heating. Thus, longer pulses at lower power
are occasionally preferred over their shorter, higher power counterparts.
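
The inverse relationship in (5.2) can be sketched in a few lines of Python. The first function evaluates (5.2) directly. The second assumes, as an approximation not spelled out in the text, that the excitation envelope of a rectangular pulse follows the magnitude of its Fourier transform, |sin(x)/x| with x = πΔντp, which vanishes at the first null. Function names are illustrative.

```python
import math


def first_null_offset_hz(pulse_duration_s):
    """Offset (Hz) from the carrier to the first excitation null, 1 / tau_p."""
    if pulse_duration_s <= 0:
        raise ValueError("pulse duration must be positive")
    return 1.0 / pulse_duration_s


def excitation_envelope(offset_hz, pulse_duration_s):
    """|sin(x)/x| with x = pi * offset * tau_p, normalized to 1 on resonance.

    Assumption: the envelope is approximated by the Fourier transform of a
    rectangular pulse (a linear-response picture).
    """
    x = math.pi * offset_hz * pulse_duration_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


if __name__ == "__main__":
    # A short broadband pulse and a long selective pulse.
    for tau in (10e-6, 10e-3):
        print(f"tau_p = {tau:g} s -> first null at {first_null_offset_hz(tau):g} Hz")
```

Under these assumptions a 10 µs pulse places its first null 100 kHz from the carrier, while a 10 ms pulse excites only a band of about 100 Hz on either side.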

Another defining characteristic of the RF pulse is pulse phase. Any wave can be
described in terms of its frequency and phase shift, which defines the amplitude of
the wave at the origin (t = 0). If the amplitude of a sine wave at the beginning of the
pulse is zero, the phase shift is 0. However, if the wave at t = 0 has maximum
amplitude A0, the phase of the wave has shifted by 90°, or π/2 rad. A further 90° phase
shift (180° or π radians total shift) results in a return to zero amplitude. A 270°
phase shift results in a negative amplitude (-A0) at t = 0, while a 360° (2π rad) shift
returns to the beginning of the cycle. We will find that pulse phase plays a critical
role in selecting particular pathways of excitation in multiple-pulse NMR
experiments.
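
The phase convention described above can be illustrated with a minimal sketch that treats the RF as a sine wave A0 sin(2πνt + φ) and evaluates it at t = 0. The function name is illustrative.

```python
import math


def initial_amplitude(a0, phase_deg):
    """Amplitude at t = 0 of the wave A0 * sin(2*pi*nu*t + phase)."""
    return a0 * math.sin(math.radians(phase_deg))


if __name__ == "__main__":
    # Step through the four phase shifts discussed in the text.
    for phase in (0, 90, 180, 270):
        print(f"phase {phase:3d} deg -> amplitude {initial_amplitude(1.0, phase):+.3f}")
```

The four phases give amplitudes of zero, +A0, zero, and -A0 (to within floating-point rounding).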

What happens when an RF pulse is applied to a sample in the NMR spectrometer?
Recall that the energy provided by the pulse is absorbed by nuclear spins in the
sample undergoing a transition from a lower energy to a higher energy state as
determined by (5.1). In order to absorb the energy provided by the RF, the nuclear
spins must be in phase with the applied RF. To visualize this, consider the individual
nuclear spin dipoles as being randomly oriented, but precessing around the applied
magnetic field at their transition frequency (called the Larmor frequency). Boltzmann
weighting of the spin dipole vectors implies that there is a net magnetization M
along the +z axis (aligned with the applied field B0). When an RF pulse is applied,
the magnetic component of the RF results in a "tipping" of the net magnetization M
away from the z axis, generating a coherent ensemble (Fig. 5.3) [2] that induces a
macroscopic RF signal at the Larmor frequency, which is detected in the NMR
experiment.

The RF signal detected in an NMR experiment appears as a (weak) time-domain
oscillating current at the Larmor frequency in the receiver coil of the NMR probe.
This time-domain oscillation (called a free-induction decay, or FID) is passed to a
device called a "mixer," which subtracts the observed frequency from the standard
carrier frequency νRF, shifting the very high-frequency NMR signal down into the audio
range so that it can be easily handled by standard electronics. The FID is amplified,
digitized (so that it can be handled by a computer) and then subjected to Fourier
transformation, a mathematical operation that converts the time-domain FID to a
frequency-domain spectrum.
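
The chain from FID to spectrum can be sketched with a synthetic signal. The example below assumes an idealized, noise-free complex FID already shifted to the audio range by the mixer (a single resonance at a chosen offset from the carrier, decaying exponentially) and applies a plain discrete Fourier transform. All names and parameter values are hypothetical; real processing software uses fast Fourier transforms together with apodization and phase correction.

```python
import cmath
import math


def synth_fid(offset_hz, decay_s, dwell_s, n_points):
    """Complex FID after mixing: an oscillation at the offset frequency
    multiplied by an exponential decay with time constant decay_s."""
    return [
        cmath.exp(2j * math.pi * offset_hz * k * dwell_s) * math.exp(-k * dwell_s / decay_s)
        for k in range(n_points)
    ]


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)); fine for a short demo."""
    n = len(signal)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * k / n) for k, s in enumerate(signal))
        for m in range(n)
    ]


def peak_offset_hz(fid, dwell_s):
    """Offset (Hz) of the most intense point in the magnitude spectrum."""
    n = len(fid)
    spectrum = dft(fid)
    m = max(range(n), key=lambda i: abs(spectrum[i]))
    if m >= n // 2:
        m -= n  # the upper half of the DFT output holds negative offsets
    return m / (n * dwell_s)


if __name__ == "__main__":
    fid = synth_fid(offset_hz=125.0, decay_s=0.05, dwell_s=0.001, n_points=256)
    print(f"peak found at {peak_offset_hz(fid, 0.001):.1f} Hz from the carrier")
```

With a dwell time of 1 ms the spectral width is 1,000 Hz, and the 256-point transform places the resonance in the bin corresponding to its 125 Hz offset.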
Fig. 5.2 Inverse relationship between pulse length and excitation bandwidth. Amplitude (pulse or
absolute value of excitation envelope) is in arbitrary units on the vertical axis. Note that the
horizontal axis is in arbitrary units of time for pulse length (time), represented by shaded rectangles,
or time⁻¹ (frequency) for power spectra. Top: short duration pulse (three arbitrary time points,
indicated by shaded rectangle), with power spectrum plotted in units of radians/time point. Nulls
in the excitation envelope are located at -85 rad/time and 425 rad/time. Bottom: long-duration (200
time points, indicated by shaded rectangle) pulse, with nulls located close to the center of the
frequency plot
Fig. 5.3 Generation of a coherent ensemble by an RF pulse in the NMR experiment. Top left: an
incoherent ensemble of spin dipoles (light lines), oriented randomly and precessing at the Larmor
frequency around the applied field B0. Boltzmann weighting results in a net magnetization M along
+z. Note that most of the spins are still randomly oriented. Top right: after an RF pulse is applied,
the magnetic component of the RF field tilts the net magnetization away from +z, still precessing
around the applied field B0 (precession indicated by curved arrows). The orthogonal detector
circuits detect the precession as an oscillating voltage, generating a free-induction decay (bottom)


5.3    Relaxation  of  NMR  Transitions 


5.3.1 Return to Equilibrium Spin Populations:
Spin-Lattice (T1) Relaxation

NMR spectroscopic transitions are typically much lower in energy than the thermal
energy available from the environment as measured by kT (Boltzmann's constant,
k = 1.3806505 × 10⁻²³ J K⁻¹, with T the absolute temperature). As such, NMR is a
relatively non-perturbing spectroscopic probe: while sample heating can occur if care
is not taken, Boltzmann populations of biomolecule conformations are usually not
perturbed by the NMR experiment, and vibrational and electronic states are
unaffected. The price of the non-perturbing nature of NMR is that it is very insensitive
relative to other forms of molecular spectroscopy, and NMR experiments typically
take much longer to run than other types of spectroscopic measurements.
Because the transitions involved are low energy (≪ kT), the Boltzmann population
difference between spin states connected by an NMR transition is small, unlike
most other forms of molecular spectroscopy, in which the ground state is almost
exclusively populated at the beginning of the experiment. However, in order to
obtain an NMR signal, there must be some population difference between the
connected spin states. The net absorption of RF energy moves spins from a lower
energy (more populated) spin state to a higher energy state until the populations of
the two states are equal, at which point no further net RF absorption can occur. This
phenomenon is called saturation of a transition, and at this point the experiment
cannot be repeated until the states linked by the transition return to a near-Boltzmann
population distribution. The process by which energy is lost to the surroundings,
thus restoring an equilibrium distribution of spin populations, is called spin-lattice
relaxation. In the simplest case, spin-lattice relaxation can be fit to a simple
exponential decay, with a characteristic relaxation rate constant R1. The inverse of this
rate constant is T1, the spin-lattice relaxation time, and can be thought of as the time
required for the difference between the NMR signal amplitude and its equilibrium
value A0 to fall to A0/e, where e is the base of
the natural logarithms and A0 is the amplitude of the signal generated by equilibrium
population distributions (that is, after a very long delay between successive signal
acquisitions). Because the energy differences between excited and ground states in
NMR are very small (and coupling with local magnetic field fluctuations inefficient
as a result), T1 relaxation times of nuclear spins commonly observed in NMR
experiments can often be quite long, on the order of a second or more. This means that
the rate at which an NMR experiment can be repeated is much lower than for other
spectroscopic methods, so that signals take longer to build up to usable levels.
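
Two of the quantities discussed in this section lend themselves to a short numerical sketch: the Boltzmann population ratio of two spin states separated by hν, and the recovery of signal after saturation. The sketch assumes a two-level system and a single-exponential recovery, 1 - exp(-t/T1). Planck's constant does not appear in the text and is introduced here only to convert frequency to energy; the function names are illustrative.

```python
import math

PLANCK_H = 6.62607015e-34   # J s (CODATA value; not quoted in the text)
BOLTZMANN_K = 1.380649e-23  # J K^-1 (CODATA value)


def population_ratio(nu_hz, temp_k):
    """Boltzmann ratio N_upper / N_lower = exp(-h*nu / (k*T))."""
    return math.exp(-PLANCK_H * nu_hz / (BOLTZMANN_K * temp_k))


def recovered_fraction(delay_s, t1_s):
    """Fraction of the equilibrium amplitude A0 recovered a time delay_s
    after saturation, assuming single-exponential spin-lattice relaxation."""
    return 1.0 - math.exp(-delay_s / t1_s)


if __name__ == "__main__":
    ratio = population_ratio(500.0e6, 298.0)
    print(f"N_upper/N_lower for 1H at 500 MHz, 298 K: {ratio:.7f}")
    for delay in (1.0, 3.0, 5.0):
        print(f"delay = {delay:.0f} x T1 -> {recovered_fraction(delay, 1.0):.3f} of A0")
```

For 1H at 500 MHz and room temperature the two populations differ by roughly one part in 10⁴, which is the origin of the low sensitivity noted above. A delay of one T1 restores about 63% of the equilibrium signal, and about five T1 are needed for nearly complete recovery.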


5.3.2    Loss  of  Coherence:  Spin-Spin  (T2)  Relaxation 

A second limiting factor in NMR experiments is the loss of coherence over time.
Recall that the coherent ensemble of spins that produces the NMR signal is generated
by the application of an RF pulse. After the pulse, the spins that form the coherent
ensemble precess in phase at their Larmor frequency. However, local magnetic
field fluctuations due to local spin-spin interactions result in the decay and eventual
loss of the coherence. Such processes do not necessarily return spins to their
equilibrium populations, as in spin-lattice relaxation, but only remove them from the
coherent ensemble. Again, this loss of signal can often be modeled as a simple
exponential decay of signal intensity with time, with a decay constant R2. The
inverse of R2 is T2, the spin-spin relaxation time. As with T1, T2 can be considered
as the time required for a coherence-induced signal in the NMR detector to decay in
amplitude from the original A0 to A0/e. Clearly, spin-lattice relaxation contributes to
T2 (T2 is never longer than T1 for a given coherence) but time-dependent local
fluctuations in the magnetic field that have nothing to do with spin-lattice relaxation
nevertheless contribute to spin-spin interactions that do not result in loss of energy
to the environment, but only to loss of coherence. We will see later that the
frequencies of these local fluctuations provide valuable information regarding molecular
dynamics, and can be extracted from experimentally measured relaxation times.
However, at present, the important consideration is that T2 processes limit the
lifetime of a coherence generated in an NMR experiment. In biological
macromolecules, T2 is often the overriding consideration in designing multi-pulse experiments,
since this relaxation time determines whether the coherence generated at the
beginning of an experiment will survive long enough to be detected at the end.

Experimentally, T2 (or at least an approximate value T2*) can be estimated for a
given resonance from line width. Ideally, absorptive NMR resonances have
Lorentzian lineshapes, with the width of a resonance at half height in Hz (Δν1/2)
inversely proportional to the lifetime of the coherence from which it arises (5.3):

$$\Delta\nu_{1/2} = \frac{1}{\pi T_2^*} \qquad (5.3)$$

In other words, spectral lines are narrower for longer-lived coherences. Thus, a line
width of 35 Hz (typical for a 1H resonance in a protein) corresponds to a T2* of 9 ms.
For small molecules, line widths on the order of 1 Hz are common (T2* of 320 ms).
Note that the value of T2* includes not only the intrinsic T2 relaxation time of the
coherence but experimental contributions as well (inhomogeneous magnetic fields,
sample inhomogeneities, etc.).
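
The two line-width examples above can be reproduced with a minimal sketch, assuming the usual Lorentzian relation between the width at half height and the effective relaxation time, Δν1/2 = 1/(πT2*). Function names are illustrative.

```python
import math


def t2_star_from_linewidth(linewidth_hz):
    """Effective T2 (s) from the full width at half height (Hz) of a
    Lorentzian line: T2* = 1 / (pi * linewidth)."""
    return 1.0 / (math.pi * linewidth_hz)


def linewidth_from_t2_star(t2_star_s):
    """Inverse relation: width at half height (Hz) from T2* (s)."""
    return 1.0 / (math.pi * t2_star_s)


if __name__ == "__main__":
    for width in (35.0, 1.0):
        print(f"{width:5.1f} Hz -> T2* = {1e3 * t2_star_from_linewidth(width):.0f} ms")
```

A 35 Hz line gives about 9 ms and a 1 Hz line about 320 ms, matching the values quoted in the text.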


5.4    Structural  Information  Available  from  NMR 

Given the challenges it presents, why is NMR still so useful and important for
biophysics? It turns out that relatively long relaxation times are both the bane and boon
of NMR. As shown above, the line width of a spectroscopic transition is inversely
proportional to the lifetime of the coherence as measured by T2. For small molecules
in nonviscous solution, lifetimes of ~1 s are not uncommon. Even for biological
macromolecules, coherence lifetimes >50 ms can be observed under favorable
circumstances. This leads to relatively narrow spectral lines and a high degree of
distinguishability for transitions that are close in energy. In UV/visible and IR
spectroscopy, excited state lifetimes are usually in the ps-ns range (10⁻¹² to 10⁻⁹ s),
with correspondingly broader lines, and poorer distinguishability. A second benefit
of relatively long coherence lifetimes in NMR is that coherences generated on one
spin last long enough to be transferred via coupling to other spins, and the results
analyzed in terms of coherence transfer pathways [3]. This statement is italicized to
indicate its importance: The benefits of multidimensional NMR arise from the fact
that spins that are coupled to each other in some fashion can be connected by
coherence transfer and the results analyzed in terms of that coupling. How coupling
occurs and what types of coherence transfer pathways are commonly analyzed will
be examined in more detail below.

Table 5.1 Gyromagnetic ratios (γ), sensitivity relative to 1H and resonance frequencies at 11.74 T
for nuclei commonly used in biomolecular NMR

Nuclide                  γ (rad s⁻¹ T⁻¹ × 10⁻⁷)   Sensitivity   Frequency (MHz) at 11.74 T
1H                       26.7510                   1.00000       500.0
2H (quadrupole, I = 1)   4.1064                    0.00362       76.7
13C                      6.7263                    0.01590       125.7
15N                      -2.7116                   0.00104       50.7
19F                      25.1665                   0.83262       470.6
31P                      10.8289                   0.06333       202.6

Finally, NMR benefits from the sheer numbers of available spectroscopic
targets. Besides 1H, a variety of other NMR-active isotopes present themselves in a
typical biomolecule. Although the most common isotope of carbon, 12C, has no
spin, 13C (I = 1/2, 1.1% natural abundance) is common enough that it is also a
spectroscopic target and can be enriched by appropriate synthetic or biosynthetic means.
The same is true for 15N (I = 1/2, natural abundance <0.5%), which is commonly
enriched by expression of proteins on minimal media containing a 15N-labeled
ammonium salt. 31P (I = 1/2, 100% natural abundance), while having some drawbacks,
is a potential NMR target in nucleic acids. 19F (I = 1/2, 100% natural abundance) can
often be substituted for hydrogen in amino acids and nucleotide bases, and is a
useful NMR probe. By comparison, UV-visible spectroscopy rarely has more than a
few distinguishable transitions in a biomolecule, and IR-based techniques, which
have far more spectroscopic targets than even NMR (vibrational modes), suffer
from the difficulty of detecting coherence transfer between coupled modes due to
short excited state lifetimes.


5.4.1    Chemical  Shift  and  Nuclear  Shielding 

Equation 5.1 allows us to calculate the resonance frequency of a given nuclide as a
function of gyromagnetic ratio and applied magnetic field strength. Table 5.1 lists
some commonly observed NMR nuclides, their gyromagnetic ratios and resonance
frequencies in an applied magnetic field (B0) of 11.74 T. In general, only one nuclide
is observed at a time, although schemes for simultaneous detection of more than one
nuclide have been described [4-6]. From Table 5.1, it is expected that in an 11.74 T
magnetic field, 1H will resonate at ~500 MHz. If all of the 1H spins in a molecule
resonated at exactly the same frequency, they would not be distinguishable and
NMR would not be particularly useful. However, local variations in the magnetic
field arise from interactions between electrons in the chemical environment
surrounding the observed nuclear spin and the applied field. Depending on the spatial
distribution of electron density surrounding the observed nuclear spin relative to the
applied magnetic field B0, these local variations can either increase or decrease the
net magnetic field Beff sensed by the nuclear spin. If the effect is additive (Beff > B0),
the Zeeman splitting increases slightly, the resonance moves to a higher frequency,
and the nucleus is deshielded. If Beff < B0, the resonance moves to a lower frequency,
and the nuclear spin is shielded. Such variation in resonant frequencies of spins as a
function of chemical environment is referred to as chemical shift. Chemical shifts,
δ, are usually reported as a ratio of the difference in resonance frequency of the
nucleus in question (νobs) relative to a standard νref:

$$\delta_{\mathrm{obs}} = \frac{\nu_{\mathrm{obs}} - \nu_{\mathrm{ref}}}{\nu_{\mathrm{ref}}} \qquad (5.4)$$

Chemical shifts tend to be quite small relative to the reference frequency νref,
often in the parts per million (ppm) range, and are so reported. This convention has
the added convenience of being field-independent, so that chemical shifts can be
compared between different spectrometers and different magnetic field strengths.
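
Equation 5.4 and the field independence of the ppm scale can be illustrated with a short sketch. The frequencies used are hypothetical round numbers chosen for the example, and the function names are illustrative.

```python
def chemical_shift_ppm(nu_obs_hz, nu_ref_hz):
    """Chemical shift from (5.4), scaled to parts per million."""
    return (nu_obs_hz - nu_ref_hz) / nu_ref_hz * 1.0e6


def offset_hz(delta_ppm, nu_ref_hz):
    """Frequency offset (Hz) from the reference for a given shift in ppm."""
    return delta_ppm * 1.0e-6 * nu_ref_hz


if __name__ == "__main__":
    # A resonance 4,000 Hz above a 500 MHz reference frequency.
    delta = chemical_shift_ppm(500.004e6, 500.0e6)
    print(f"shift = {delta:.2f} ppm")
    # The same shift corresponds to a larger offset in Hz at higher field.
    for ref_mhz in (500.0, 600.0, 800.0):
        print(f"{ref_mhz:.0f} MHz: {offset_hz(delta, ref_mhz * 1e6):.0f} Hz from reference")
```

The shift in ppm stays fixed while the separation in Hz grows in proportion to the field, which is one reason spectral dispersion improves at higher field strengths.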

Because the net shielding/deshielding of a nuclear spin depends on the spatial
distribution  of  electron  density  around  the  nucleus  as  well  as  the  orientation  of  that 
density  relative  to  the  applied  magnetic  field  B0,  chemical  shift  is  determined  by  the 
shielding  tensor,  the  value  of  which  depends  upon  the  orientation  of  the  molecular 
frame  in  the  applied  magnetic  field.  For  samples  in  which  motions  are  restricted 
(solids,  liquid  crystals),  this  leads  to  a  spread  of  chemical  shifts  for  a  given  nuclear 
spin  over  a  range  determined  by  the  principal  components  of  the  shielding  tensor. 
However,  in  isotropic  liquids,  the  random  tumbling  of  the  molecule  averages  the 
chemical  shielding  over  all  orientations,  reducing  the  chemical  shift  to  a  scalar 
value  so  that  only  a  single  resonance  line  is  observed. 

The most commonly observed nuclei in biomolecules are 1H, 13C, and 15N. Of the
three, the chemical shift of 1H is most obviously affected by environmental factors
(temperature, pH, hydrogen bonding and shielding/deshielding from nearby
aromatic rings, known as ring current effects). Indeed, the extent of folding of a protein
can often be determined qualitatively by a quick examination of the 1H spectrum.
A folded protein generally shows greater 1H chemical shift dispersion than the same
protein unfolded, since close packing of aliphatic and aromatic amino acid side
chains gives rise to distinctive upfield (towards lower δ) shifts, while regular
hydrogen bonding patterns in secondary structures such as turns and β-sheets yield
downfield (towards higher δ) shifts for the involved amide protons. Note from Fig. 5.4
that aliphatic protons (attached to sp3 carbon) resonate between ~0 and 5 ppm,
aromatic (sp2-C-attached) protons tend to cluster between 6 and 7 ppm, while amide
protons cover a range between 6 and 10 ppm.

The chemical shift range of 1H is limited relative to other nuclei in that the
allowed electron distribution around the hydrogen atom (in a 1s orbital) is relatively
spherical and not easily distorted. 13C, on the other hand, is much more sensitive to
bonding and hybridization effects than 1H. The shielding patterns and electron
distribution of sp3 vs. sp2 hybridization, the type and electronegativity of atoms bonded
to the observed 13C, and even branching patterns and dihedral angles formed by
attached functionality have well-characterized and predictable effects on 13C shifts
[7]. Analysis of 13C chemical shifts as a way of predicting or determining secondary
structures in proteins is now common, and multiple programs and web-accessible
resources are available for this purpose, as discussed below.

Fig. 5.4 1H NMR spectrum of folded (left) vs. unfolded (right) cytochrome P450cam (CYP101A1),
showing increased spectral dispersion for the folded 46 kDa enzyme


5.4.2    Paramagnetic  Effects  on  Chemical  Shift 

The above generalizations regarding 1H, 13C, and 15N shifts hold true under most
circumstances in diamagnetic molecules (i.e., molecules with no unpaired
electronic spins present). However, in the presence of a metal center or metal-containing
prosthetic group with unpaired electronic spins, such as a heme or iron-sulfur
cluster, the effects of electronic spin on chemical shifts are often quite large. In solution
NMR, these effects fall into two classes, contact and pseudocontact shifts. The
contact shift results from overlap between orbitals containing unpaired electron spin
density and those involving the shifted nuclear spin, and is therefore usually
restricted to spins that are part of the same bonded unit as the paramagnetic center.
Contact shifts are often observed for 1H, 15N, and 13C resonances in Fe(III) and
high-spin Fe(II) hemes, as well as for 1H and 15N spins that are involved in hydrogen
bonding or other direct interactions with iron-sulfur clusters or other paramagnetic
metal centers in proteins.

Pseudocontact shifts result from through-space interactions between the
electronic and nuclear spin dipoles, and can thus be observed for nuclear spins that are
not directly part of the bonded system containing the unpaired spin. For example, 1H
spins located above or below the plane of a heme prosthetic group in a protein often
experience exaggerated shifts due to pseudocontact interactions.

A  detailed  examination  of  NMR  of  paramagnetic  molecules  is  beyond  the  scope 
of  this  article.  For  the  interested  reader,  there  are  a  number  of  useful  monographs 
and  reviews  [8,  9].  However,  it  is  worth  noting  here  that  not  only  are  chemical  shifts 
affected  by  paramagnetism,  but  relaxation  tends  to  be  much  more  efficient  for 
nuclear  spins  in  the  presence  of  unpaired  electrons.  This  means  that  coherence 
transfer  experiments  are  typically  more  difficult  in  paramagnetic  systems,  because 
coherences  relax  too  quickly  to  be  efficiently  transferred  between  spins.  However, 
enhanced relaxation due to paramagnetism has been used to advantage by the
introduction of paramagnetic centers in the form of spin labels and paramagnetic metal
ions  to  examine  protein-protein  interactions  [10],  oligomerization  [11],  surface 
exposure  [12],  and  characterization  of  membrane  proteins  [13]. 


5.4.3    J-Coupling  and  Coherence  Transfer — the  Basis 
of  Multidimensional  NMR 

As discussed above, one of the major advantages of NMR is the potential for
creating relatively long-lived coherences on one spin that can then be transferred to a
coupled spin for further evolution before detection. The appropriate excitation,
evolution, and transfer of coherence allow the correlation between coupled spins to be
established. In solution NMR, the primary mechanism for coherence transfer is
J-coupling, also known as scalar coupling. J-coupling results from the interactions
between nuclear spins mediated by the electrons in the overlapping orbitals that link
the coupled nuclei. Because electrons have spin, they are affected by the spin state
of the nuclei, changing their energies slightly depending upon whether they have the
same or opposite spin relative to the nucleus with which they interact. This change
in energy is detected by nuclear spins that are part of the same bonded system,
resulting in slightly different energy levels (splitting) that depend upon the spin
states of coupled nuclear spins in the same system. Thus, a 15N-1H bonded pair in a
peptide bond will have two signals for each nuclear spin, with intensities reflecting
the populations of +1/2 and -1/2 spins of the coupling partner. The separation of the
signals is the same for both nuclei, and is termed the coupling constant. The
J-coupling constant is represented by a superscript indicating the number of bonds
by which the coupled spins are separated and a subscript indicating which atoms are
involved. For example, the NH pair in an amide is coupled by 1JNH, and the splitting
observed is usually on the order of 94 Hz (Fig. 5.5). One-bond couplings are usually
the largest in magnitude, since orbital overlap between bonded atoms is the most
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Fig.  5.5  Superimposed  coupled  (red),  decoupled  (green),  and  TROSY-HSQC  (blue)  spectral  cor- 
relations for  a  single  N-H  pair  in  an  enzyme.  The  four  red  peaks  comprise  the  correlation  in  the 
coupled  spectrum,  and  are  split  from  the  adjacent  peaks  in  both  dimensions  by  the  coupling 
(-90  Hz).  Decoupling  during  evolution  and  acquisition  results  in  a  collapse  of  the  four  peaks  into 
a  single  correlation  (green  central  peak),  simplifying  the  spectrum.  Note  that  the  lower  right  com- 
ponent is  the  narrowest  and  most  intense,  and  can  be  selected  exclusively  using  TROSY  phase 
cycling.  The  TROSY  spectrum  (blue)  is  superimposed  on  the  coupled  spectrum 


direct. Two protons attached to the same carbon (geminal coupling) are coupled to
each other by ²J_HH, ranging from 5 to 15 Hz. Two protons on adjacent carbons show
vicinal couplings ³J_HH between 0 and 15 Hz. Because of the requirement that orbitals
overlap for J-coupling to occur, three-bond couplings are quite sensitive to dihedral
angle, and the measured values for ³J-couplings can be used as restraints for
calculations of polypeptide and polynucleic acid structures [14, 15]. Weak heteronuclear
couplings can sometimes be observed even across a strong 15N-1H···O=13C
hydrogen bond, as in nucleic acid base pairing [16].
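
The dihedral-angle dependence of three-bond couplings is commonly described by a Karplus-type relation, ³J(θ) = A cos²θ + B cos θ + C. The sketch below illustrates this for the backbone ³J(HN,Hα) coupling as a function of the backbone angle φ. The coefficients are one commonly quoted empirical parameterization and are an assumption here, not values taken from this chapter; the function name is hypothetical.

```python
import math

# Assumed empirical Karplus coefficients (Hz) for the backbone 3J(HN,HA)
# coupling; other published parameter sets differ slightly.
A, B, C = 6.51, -1.76, 1.60

def karplus_3j_hn_ha(phi_deg):
    """Predicted 3J(HN,HA) in Hz for a backbone dihedral angle phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)  # H-N-CA-H dihedral is offset from phi
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

# Idealized phi angles for the two common secondary-structure types.
for label, phi in [("alpha-helix", -60.0), ("beta-strand", -120.0)]:
    print(f"{label:12s} phi = {phi:6.1f}  3J = {karplus_3j_hn_ha(phi):.2f} Hz")
```

With these assumed coefficients, helical φ angles give a small coupling and extended φ angles a large one, which is why a measured ³J value can serve as a dihedral restraint. Because the curve is periodic, a single coupling is generally consistent with more than one angle.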

This  leads  to  an  important  observation  for  coherence  transfer  in  NMR:  Coherence 
transfer  efficiency  is  directly  proportional  to  the  size  of  the  mutual  coupling  of  the 
spins  between  which  coherence  is  transferred.  This  is  a  critical  consideration  for 
multistep coherence transfers, such as those found in three-dimensional NMR
experiments. As such, one-bond couplings are the most commonly used for coherence
transfer in multidimensional NMR. A more complete description of such
experiments  is  provided  later  in  this  article,  and  Scheme  5.1  lists  some  of  the  more 
common  couplings  used  in  biomolecular  NMR  for  coherence  transfer. 
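
One way to see why larger couplings make for more efficient transfers is to note that the delay needed for a transfer step scales as 1/(2J), and that coherence relaxes during that delay. The sketch below assumes a simple model, with antiphase buildup as sin(πJt) multiplied by exp(-t/T2), and uses approximate coupling values and an arbitrary T2. All numbers and names are illustrative assumptions rather than values from this chapter.

```python
import math

def transfer_efficiency(j_hz, t2_s):
    """Single-step transfer amplitude for coupling j_hz and relaxation time t2_s.

    Assumed model: antiphase coherence builds up as sin(pi*J*t) and decays as
    exp(-t/T2); the delay is fixed at the relaxation-free optimum t = 1/(2J).
    """
    delay = 1.0 / (2.0 * j_hz)
    return math.sin(math.pi * j_hz * delay) * math.exp(-delay / t2_s)

T2 = 0.030  # assumed transverse relaxation time in seconds (illustrative)
couplings = {"1J(N,H) ~ 92 Hz": 92.0, "1J(N,CA) ~ 11 Hz": 11.0, "2J(N,CA) ~ 7 Hz": 7.0}
for name, j in couplings.items():
    delay_ms = 1e3 / (2.0 * j)
    print(f"{name:18s} delay = {delay_ms:5.1f} ms  efficiency = {transfer_efficiency(j, T2):.2f}")
```

Under these assumptions the one-bond N-H step loses little signal, while the much longer delays required by the small N-Cα couplings cost most of it. In a multistep experiment the losses multiply.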

As 1H is never directly bonded to another 1H (except in H2), J-correlation
experiments must rely on 2- and 3-bond couplings. However, given the relative
abundance and sensitivity of 1H, this is not as large a drawback as it is for other
nuclei. Indeed, one of the first two-dimensional NMR experiments to enter common
use, COSY (for correlated spectroscopy), showed that it was possible to map complete
1H spin systems (e.g., the side chain and backbone 1H resonances of a single
amino acid residue) by through-bond correlations for small proteins and nucleic
acids [17]. A later refinement of COSY, double-quantum filtered (DQF) COSY, is
still frequently used for characterizing small organic molecules, peptides, and
oligonucleotides [18] (Fig. 5.6).

An important breakthrough in identifying and correlating 1H spin systems was
the development of the TOCSY experiment [19, 20] (Fig. 5.6). TOCSY (for totally
correlated spectroscopy) takes advantage of the ability of coupled oscillators with
the same natural frequency to transfer energy from one to another. Under normal
circumstances, free precession of coupled coherences at their respective Larmor
frequencies prevents this type of transfer, known as the Hartmann-Hahn effect.
However, if a relatively long low-power RF pulse is applied appropriately, a
phenomenon known as "spin-locking" takes place: The free precession of individual
coherences is halted, and components of the coherences are forced to remain in
phase with each other, yielding Hartmann-Hahn matching. Under these circumstances,
energy transfer occurs between coupled coherences, and once the spin-locking
field is released and the coherences detected, the resulting signal amplitudes
will be modulated by the frequencies of all of the spins that transferred energy to
each other during the spin lock. The TOCSY experiment has the advantage over
DQF-COSY that all of the J-correlations involving a particular signal are observed
simultaneously, reducing ambiguity in crowded spectra.

While 1H is the only naturally abundant spin that typically occurs in multi-spin
coupled systems, introduction of uniform 13C labeling into a biomolecule allows
similar experiments to identify complete 13C spin systems as well. The HCCH-TOCSY
experiment [21], which will be described more completely in the next section,
involves spin locking of the 13C nuclei within a coupled spin system, so that all
of the coupled carbons can be identified in a single data set.


5.4.4    Sequential  Resonance  Assignments,  Polarization 
Transfer,  and  Triple-Resonance  Experiments 

Most  of  the  value  of  biomolecular  NMR  can  only  be  realized  if  resonances  have  been 
assigned  to  a  specific  atom  in  the  macromolecule.  This  is  true  regardless  of  whether 
the  macromolecule  under  consideration  is  a  protein  or  a  nucleic  acid.  Knowing,  for 
Scheme 5.1 Approximate values of J-coupling constants commonly used for coherence transfer
and dihedral angle measurements in biomolecular NMR. Top, protein; bottom, RNA. Values for
proteins adapted from [154], and for RNA adapted from [155]
Fig. 5.6 Corresponding regions of the two-dimensional 800 MHz 1H COSY (top) and TOCSY
(bottom) spectra of mycinamicin, a macrolide antibiotic. Direct correlations that are obvious in the
COSY spectrum are either less intense or completely absent in the TOCSY spectrum, where
multistep coherence transfers predominate
example, that a valine spin system identified by HCCH-TOCSY in an enzyme is
affected  by  adding  substrate  doesn't  help  much  if  one  doesn't  know  to  which  of  the 
35  valine  residues  in  the  enzyme  the  spin  system  belongs.  The  process  of  sequential 
resonance  assignment  is  therefore  a  critical  (if  sometimes  challenging)  part  of  the 
analysis  of  biomolecular  NMR  data.  A  useful  guide  as  to  whether  a  protein,  nucleic 
acid,  or  complex  can  be  assigned  by  NMR  methods  is  provided  by  the  Biological 
Magnetic  Resonance  Data  Bank  (BMRB,  www.bmrb.wisc.edu).  This  well-curated 
database  is  organized  so  as  to  allow  search  by  keyword  or  molecule  type,  as  well  as 
by  the  methodology  used,  and  so  is  a  convenient  first  stop  for  considering  the  viabil- 
ity of  a  project.  Deposition  of  experimental  NMR  data  (sequential  assignments, 
NOEs,  coupling  constants,  residual  dipolar  couplings  (RDCs),  etc.)  in  the  BMRB  is 
a  prerequisite  for  the  deposition  of  NMR-determined  biomolecule  structures  in  the 
RCSB  Protein  Data  Bank  (www.rcsb.org/pdb). 

While homonuclear (1H,1H) 2D NMR experiments are still widely used for the
characterization of peptides, oligonucleotides, and small proteins [22], sequential
resonance assignments in proteins >5 kDa are made almost exclusively using
uniformly 15N and 13C labeled samples via a suite of triple-resonance three-dimensional
NMR experiments [23]. These experiments take advantage of 1- and 2-bond
J-couplings between 15N, 13C and their attached protons to "connect the dots," that
is, determine the sequential connectivity of the individual amino acid spin systems
via couplings involving backbone 13C, 1H, and 15N resonances. As "triple-resonance"
implies,  each  nucleus  is  resolved  along  one  axis  of  a  three-dimensional  data  set, 
with  the  observed  correlations  indicating  connectivity  between  a  particular  set  of 
resonances,  one  in  each  dimension. 

The names of the various experiments describe the particular correlations that
are observed in each. For example, the HNCA experiment [24, 25], a mainstay of
the sequential assignment process, correlates an amide 1H with the 15N nitrogen to
which it is bonded. In turn, the 1-bond coupling between that amide nitrogen and
the Cα carbon of the residue to which the amide belongs, as well as the 2-bond
coupling to the Cα of the previous residue (i-1), serve to connect two amino acid
spin systems. The HN(CO)CA experiment, on the other hand, correlates the amide
N with only the i-1 Cα, removing any ambiguity as to which Cα correlation is
which in the HNCA experiment. The parentheses around (CO) in the experiment
name mean that although coherence is passed through the carbonyl carbon from the
amide to the preceding Cα, the carbonyl carbon is not resolved in the experiment,
that is, the three axes are 1H, 15N, and 13Cα. Another pair of experiments,
HN(CO)CACB and HNCACB, use the same methodology, but extend correlations to the Cβ
carbons, one carbon atom out from the polypeptide backbone into the side chains of
adjacent amino acids. These experiments are particularly useful for sequential
assignments, inasmuch as the combination of Cα and Cβ shifts is often diagnostic
of amino acid type, while either shift alone is less so. Figure 5.7 provides
representative views of these experiments to show how the correlations are made.
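
The "connect the dots" logic of HNCA and HN(CO)CA can be sketched in a few lines. In the sketch below, each amide spin system carries the Cα shift of its own residue and of the preceding residue, and two spin systems are linked when the own-residue shift of one matches the preceding-residue shift of another. The shift values, labels, tolerance, and function names are all hypothetical and are chosen only for illustration.

```python
# Hypothetical spin systems: each amide (label) has the CA shift of its own
# residue (ca_i) and of the preceding residue (ca_prev), in ppm.
spin_systems = {
    "A": {"ca_i": 58.2, "ca_prev": 62.5},
    "B": {"ca_i": 54.9, "ca_prev": 58.2},
    "C": {"ca_i": 62.5, "ca_prev": 45.1},
    "D": {"ca_i": 60.3, "ca_prev": 54.9},
}

def link_spin_systems(systems, tol=0.15):
    """Return {label: label of the following spin system} from CA shift matching."""
    links = {}
    for prev_label, prev in systems.items():
        matches = [
            label for label, s in systems.items()
            if label != prev_label and abs(s["ca_prev"] - prev["ca_i"]) <= tol
        ]
        if len(matches) == 1:  # keep only unambiguous connections
            links[prev_label] = matches[0]
    return links

def build_chain(links):
    """Walk the links, starting from the spin system that nothing points to."""
    starts = [label for label in links if label not in links.values()]
    chain = [starts[0]]
    while chain[-1] in links:
        chain.append(links[chain[-1]])
    return chain

links = link_spin_systems(spin_systems)
print(build_chain(links))
```

Real data are far less tidy: Cα shifts are frequently degenerate within the tolerance, which is why the Cβ shifts from HNCACB and HN(CO)CACB are normally used as a second matching criterion. This sketch simply discards ambiguous matches and does not guard against cyclic links.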

The observant reader might notice that all of these experiments involve the amide
proton, and this is no accident. Because 1H has the largest gyromagnetic ratio of the
three nuclei under observation, it is advantageous to detect 1H directly for the
Fig. 5.7 Strips from 15N planes of HNCA (left) and HNCACB (right) showing sequential connectivities
for residues 386-390 of cytochrome P450cam, a 414-residue metalloenzyme. The 15N chemical shift
of the correlating amide NH group is shown at the top of each strip, the NH 1H chemical shift is
shown at the bottom, with 13C shifts shown along the right side of each set of strips. Darker
correlations are for Cα with either the adjacent NH or the NH of the following residue. Lighter dotted line
correlations show similar information for Cβ


greatest sensitivity. However, the 1H advantage can be further exploited using a trick
called polarization transfer. Because the proton has the largest γ of the nuclei
involved, it exhibits the largest Boltzmann population difference between Zeeman
states at a given magnetic field. By applying the appropriate series of simultaneous
pulses to 1H and 15N, it is possible to "sort" the 15N spins by the spin state of the
attached 1H, essentially transferring the 1H population difference (and the
corresponding improvement in sensitivity) that naturally occurs for the proton to the attached
15N [26]. This is the first step in all of the triple-resonance experiments noted above.
Polarization is transferred from amide 1H to amide 15N. Subsequent simultaneous
pulses on 15N and 13C pass this polarization on to the appropriate 13C atoms. At each
nucleus along the way, the transferred polarization is allowed to evolve at the
Larmor frequency of the nucleus on which it resides, thus labeling the resulting
coherence with the chemical shift of each nucleus. After the coherence is appropriately
labeled, it is passed back through reverse polarization transfers to the amide 1H
from which it started, where it is detected.
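
The size of the population advantage that polarization transfer exploits can be estimated directly from the Boltzmann distribution. The sketch below computes the fractional population difference for spin-1/2 nuclei using rounded tabulated gyromagnetic ratios. The field strength, temperature, and function name are assumptions chosen for illustration.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K
# Gyromagnetic ratios in rad s^-1 T^-1 (standard tabulated values, rounded)
GAMMA = {"1H": 267.522e6, "13C": 67.283e6, "15N": -27.116e6}

def polarization(nucleus, b0_tesla, temp_k=298.0):
    """Fractional Boltzmann population difference for a spin-1/2 nucleus."""
    energy_gap = abs(GAMMA[nucleus]) * HBAR * b0_tesla  # Zeeman splitting in J
    return math.tanh(energy_gap / (2.0 * K_B * temp_k))

B0 = 14.1  # tesla, roughly a 600 MHz 1H spectrometer (assumed)
for nuc in GAMMA:
    print(f"{nuc:>3s}  polarization = {polarization(nuc, B0):.2e}")
gain = polarization("1H", B0) / polarization("15N", B0)
print(f"1H -> 15N polarization-transfer gain ~ {gain:.1f}")
```

The population differences are only a few parts in 100,000 even for 1H, which is why NMR is an insensitive technique and why starting each experiment from 1H polarization, rather than from 15N or 13C, is worth the added complexity.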

The pulse sequence of an HNCA experiment is shown in Fig. 5.8, with the various
polarization transfer steps marked. Coherence transfer pathways are selected by
appropriate  pulse  and  receiver  phase  cycling,  and  the  application  of  pulsed  field 
Fig. 5.8 Simplified HNCA pulse sequence. No phase cycling or pulsed field gradients are shown.
Pulses are indicated by vertical black bars on the RF channels on which they are applied. Thinner
bars are 90° pulses, thicker bars are 180° pulses. (a) Polarization transfer from amide 1H to 15N.
(b) Frequency labeling of coherence on 15N during independent time variable t1, and refocusing of
carbonyl (CO) coupling with 15N. (c) Coherence transfer from 15N to 13Cα and frequency labeling
of coherence on 13Cα during independent time variable t2. (d) Return coherence transfer from 13Cα
to 15N. (e) Return coherence transfer from 15N to 1H and detection during independent time variable
t3, with composite pulse decoupling (CPD) on 15N during 1H signal acquisition. Three independent
time variables (t1, t2, and t3) transform into three frequency domains in three dimensions for 15N,
13Cα, and 1H


gradients (PFGs). The "frequency labeling periods" during which chemical shift
evolution occurs are called t1, t2, and t3. In the experiments shown, the
frequency-labeling periods t1 and t2 change in duration incrementally throughout the course of
the 3D experiment, providing a time-domain modulation of the resulting signal that
reflects the Larmor frequencies of the spins evolving during that frequency-labeling
period. The third period, t3, is the usual acquisition of the NMR signal. The resulting
time-domain data set is then subjected to three separate Fourier transforms, one
with respect to each of the three time variables, yielding the final
three-dimensional frequency-domain spectrum. The HCCH-TOCSY experiment
mentioned previously also begins with polarization transfer, in this case between 1H
and the directly bonded 13C. Thus, the coherence that is passed via Hartmann-Hahn
transfer between carbon atoms during the spin-lock is the result of polarization
transfer. The Hartmann-Hahn transfer therefore benefits from the improvement in
sensitivity. A related experiment is CBCA(CO)NH, in which initial polarization
transfer occurs from 1H to directly bonded 13Cα and 13Cβ, and is eventually passed to
the NH of the (i+1) residue, yielding information similar to HN(CO)CACB.
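
The way incremented frequency-labeling periods turn into frequency axes can be illustrated with a synthetic data set. The sketch below uses two dimensions rather than three for brevity and a plain discrete Fourier transform written out in pure Python. The signal is a single noiseless, non-decaying coherence; the frequencies and array sizes are arbitrary assumptions.

```python
import cmath

N1, N2 = 16, 16   # number of t1 increments and of t2 points
F1, F2 = 3, 5     # assumed frequencies, in cycles per N points

# Synthetic signal: coherence precesses at F1 during t1, then at F2 during t2.
signal = [
    [cmath.exp(2j * cmath.pi * (F1 * k1 / N1 + F2 * k2 / N2)) for k2 in range(N2)]
    for k1 in range(N1)
]

def dft(row):
    """Plain discrete Fourier transform of a 1D complex sequence."""
    n = len(row)
    return [
        sum(row[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]

# Transform along t2 (each row), then along t1 (each column).
rows = [dft(r) for r in signal]
cols = [dft([rows[k1][f2] for k1 in range(N1)]) for f2 in range(N2)]
spectrum = [[abs(cols[f2][f1]) for f2 in range(N2)] for f1 in range(N1)]

# Locate the most intense point in the two-dimensional spectrum.
peak = max(
    ((f1, f2) for f1 in range(N1) for f2 in range(N2)),
    key=lambda p: spectrum[p[0]][p[1]],
)
print("peak at (f1, f2) =", peak)
```

Each row of `signal` corresponds to one value of t1, that is, one separately acquired free induction decay. The indirect dimension is therefore sampled point by point, which is why experiment time grows with the number of increments in each indirect dimension.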
Fig. 5.9 600 MHz 1H,15N HSQC spectrum of putidaredoxin, a 106-residue ferredoxin, with
resonance assignments labeled. The HSQC spectrum provides a simple way of monitoring changes in
protein structure as a function of perturbations, and is often called a "fingerprint" of the protein


Polarization transfer is the basis of what is probably the single most widely used
heteronuclear NMR experiment, 1H,15N HSQC (heteronuclear single-quantum
coherence) [27]. While not directly useful for sequential assignments, the HSQC
provides a "fingerprint" of the protein that can be used, once the amide N-H
correlations have been assigned via triple resonance, as a rapid assay for folding and for
monitoring local structural perturbations in the protein. With the exception of the
N-terminal residue, each non-proline residue in a protein in principle should give rise
to a unique correlation in the 2D HSQC spectrum at the chemical shifts of the amide
1H and 15N, while side chain NH2 groups (for Gln and Asn) and Arg NεH groups give
rise to distinctive correlations as well. Figure 5.9 shows a 1H,15N HSQC spectrum
labeled with specific resonance assignments for a small protein, putidaredoxin.
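
The counting rule just described gives a quick sanity check on an HSQC spectrum: the number of observed backbone correlations should be close to the number of non-proline residues, less the N-terminus. A minimal sketch follows, assuming a one-letter sequence and an idealized spectrum in which every amide is observable and resolved. The sequence and function name are hypothetical.

```python
def expected_hsqc_peaks(sequence):
    """Count expected 1H,15N HSQC correlations for a one-letter sequence.

    Idealized: assumes every amide is observable and resolved.
    """
    # Backbone amides: every residue except the N-terminal one and prolines.
    backbone = sum(1 for i, aa in enumerate(sequence) if i > 0 and aa != "P")
    # Each Asn/Gln side-chain NH2 gives two 1H shifts at one 15N shift.
    side_chain_nh2 = 2 * sum(sequence.count(aa) for aa in "NQ")
    arg_ne = sequence.count("R")
    return {"backbone": backbone, "side_chain_NH2": side_chain_nh2, "Arg_NeH": arg_ne}

# Hypothetical 12-residue peptide, for illustration only.
print(expected_hsqc_peaks("MKPNLQRAGPSV"))
```

A count well below the expected number often points to exchange broadening, overlap, or fast amide exchange with solvent, while a collapse of the 1H dispersion suggests an unfolded sample.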

The experiments described above are used primarily for protein assignments and
are not useful for assigning nucleic acids (RNA and DNA). Because of the greater
degree of redundancy in nucleic acid spectra (only four bases as opposed to 20
amino acids) it is often advantageous to do selective (as opposed to uniform)
labeling of samples to simplify analysis [28-30]. Furthermore, because nucleotides
are covalently linked through phosphate esters, through-bond
connectivity is more difficult to obtain, since internucleotide coherence transfer
must proceed through two-bond and three-bond 13C-31P couplings (HPC experiment).
For this reason, nucleic acid sequential assignments are typically more dependent
on internucleotide nuclear Overhauser effects, discussed below, than on coherence
transfer experiments [28, 31].


5.4.5   Dipolar  Coupling  and  Nuclear  Overhauser  Effects 

In addition to J-coupling, another mechanism for information and energy transfer
between nuclear spins is dipolar coupling. Unlike J-coupling, dipolar coupling is a
direct through-space interaction between two spins, and so does not require that the
spins be part of the same bonded system, or even that they be part of the same
molecule. Dipolar coupling (D-coupling) only requires spatial proximity of the coupled
spins. In the case of 1H in solution, internuclear distances <5 Å are typical for
observable dipolar interactions. Consider two bar magnets held close to each other. They
clearly exhibit preferred orientations with respect to each other. Similarly, D-coupled
nuclear spins also affect the energy levels of their coupling partners. The dipolar
coupling constant D between two spins is proportional to the inverse cube of the distance
between the coupled spins, r, and depends on the angle θ made by the internuclear vector r with
respect to the applied magnetic field:

\[ D \propto \frac{3\cos^{2}\theta - 1}{r^{3}} \tag{5.5} \]

For a freely tumbling molecule in solution, the averaging of the angle θ over all
possible orientations means that there is no splitting of resonances due to the dipolar
coupling, as D averages to zero over all values, positive and negative. Despite the fact
that D splitting is not observed in solution, dipolar interactions between spins are
active, and provide important mechanisms for both T1 and T2 relaxation. Let us
examine spin-spin (T2) relaxation. Consider a coherence generated by an RF pulse
as in Fig. 5.3. Dipolar coupling provides a mechanism for random energy exchange
between spins, so that a coherent spin may change its spin state by passing energy
along to a D-coupling partner. This random spin state change (sometimes called a
"spin flip") results in a decrease in coherence that contributes to T2 relaxation.
Indeed, this type of D-modulated interaction is the origin of the term spin-spin
relaxation. The D-modulated interaction between spins can also result in both
coupled spins either gaining or losing energy simultaneously to the surroundings (or
lattice). This results in a net population change for both spins (as opposed to just a
loss of coherence) and in this case, D-coupling contributes to T1 relaxation (return
towards an equilibrium distribution of spin states) as well as T2.
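
As a numerical aside on Eq. 5.5, the sketch below evaluates the magnitude of the dipolar coupling for a directly bonded 15N-1H pair and checks that the angular term averages to zero over an isotropic distribution of orientations. The prefactor convention, the N-H distance of 1.02 Å, and the function names are assumptions; conventions for the dipolar constant differ by factors of two between texts.

```python
import math

MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 267.522e6          # rad s^-1 T^-1
GAMMA_N = -27.116e6          # rad s^-1 T^-1

def dipolar_constant_hz(gamma_i, gamma_s, r_m):
    """Dipolar coupling constant -(mu0/4pi)*gI*gS*hbar/(2*pi*r^3), in Hz."""
    return -MU0_OVER_4PI * gamma_i * gamma_s * HBAR / (2.0 * math.pi * r_m ** 3)

def angular_term(theta):
    """Orientation factor 3cos^2(theta) - 1 from Eq. 5.5."""
    return 3.0 * math.cos(theta) ** 2 - 1.0

def isotropic_average(n=10000):
    """Average of the angular term over a sphere (weight sin(theta)), midpoint rule."""
    d = math.pi / n
    thetas = [(i + 0.5) * d for i in range(n)]
    return sum(angular_term(t) * math.sin(t) for t in thetas) * d / 2.0

d_nh = dipolar_constant_hz(GAMMA_H, GAMMA_N, 1.02e-10)  # assumed N-H distance 1.02 A
print(f"N-H dipolar constant ~ {d_nh / 1e3:.1f} kHz")
print(f"isotropic average of 3cos^2(theta)-1 = {isotropic_average():.1e}")
```

The coupling itself is in the kHz range, yet its orientational average vanishes, which is why no dipolar splitting appears in solution spectra even though the interaction still drives relaxation.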

Because D-coupling provides a mechanism for spin energy exchange both
between coupled spins and with the surroundings, it renders the state populations
(and therefore signal amplitudes) of spins dependent upon the corresponding state
populations of their D-coupling partners. If, for example, one saturates the
transition of one spin (by applying a long low-power pulse at the transition frequency of
Fig. 5.10 1H NOESY of acireductone dioxygenase in 90% H2O obtained at 750 MHz 1H. Note
the high degree of overlap in the two-dimensional spectrum

that spin), the signal intensities of that spin's D-coupling partners will be affected.
This change in signal intensity for one spin upon perturbing the spin population of
a D-coupling partner is termed a nuclear Overhauser effect (NOE), and is an important
source of structural information for solution NMR. Before the advent of
multidimensional NMR methods, NOEs were typically measured by RF saturation of
one signal in the NMR spectrum while changes in the intensities of other signals
were monitored, thereby identifying spins close in space to the saturated spin.
However, the experiment is more conveniently accomplished as a two-dimensional
experiment called NOESY [32]. As in other two-dimensional experiments, the first
pulse generates coherence that evolves during the frequency-labeling period t1. This
is followed by a second pulse that initiates the mixing time, during which
frequency-labeled coherences exchange spin energy with their D-coupled partners. The mixing
time can vary in length depending upon the size of the molecule and T1 relaxation
rates of the various spins, but mixing times of 50-100 ms are typical for proteins
and nucleic acids. Finally, the mixing time is ended with another pulse that generates
the labeled coherences that are detected during t2. Fourier transformation with
respect to both time dimensions yields a two-dimensional spectrum in which coupled
spins give rise to cross-peaks indicating their dipolar connectivity (Fig. 5.10).

Typically, NOESY is a 1H,1H correlation experiment, since 1H is usually the only
spin in sufficient abundance to form D-coupled networks without isotopic labeling.
Until  the  advent  of  triple-resonance  sequential  assignment  experiments,  the  NOESY 
experiment  formed  a  critical  piece  of  the  sequential  assignment  puzzle,  because  it  was 
the  only  way  to  get  connectivity  between  spin  systems  of  adjacent  amino  acid  residues 
in  a  protein  or  adjacent  bases  in  a  polynucleotide.  Furthermore,  identification  of  NOEs 
between  sequentially  nonadjacent  residues  forms  the  basis  of  three-dimensional 
Fig. 5.11 Plane from 13C-edited NOESY (800 MHz 1H) of HypA, a nickel chaperone from H.
pylori, at 30.76 ppm, corresponding to the 13Cβ shifts of the four cysteine ligands of bound Zn2+.
Labeled NOE cross-peaks indicate close approach of the CβH2 groups of pairs of ligands


structure  determination  of  biological  macromolecules  by  NMR.  NOE-based  distance 
restraints  are  incorporated  into  a  variety  of  computational  approaches  for  calculating 
structures,  and  while  other  sources  of  information  are  available  and  will  be  discussed, 
the  NOESY  experiment  remains  a  mainstay  of  biomolecular  structure  determination. 
For  larger,  more  complex  systems,  the  NOESY  spectrum  can  be  resolved  by  marrying 
the  NOESY  experiment  to  HSQC,  generating  a  three-dimensional  experiment, 
NOESY-HSQC [33, 34]. Depending upon whether the HSQC correlates 1H to 13C or
15N, one can identify NOEs observed for a particular 1H spin resolved according to the
chemical shift of the 13C or 15N to which the 1H spin is bonded (Fig. 5.11).
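
NOE cross-peaks are usually converted into distance restraints by calibration against a pair of protons whose separation is fixed by covalent geometry. The sketch below assumes the isolated spin-pair approximation, in which cross-peak volume scales as the inverse sixth power of the distance. The reference distance, volumes, slack factor, and function names are hypothetical.

```python
def noe_distance(volume, ref_volume, ref_distance):
    """Estimate a distance (same units as ref_distance) from a NOE cross-peak.

    Assumes the isolated spin-pair approximation, in which cross-peak volume
    scales as r**-6, and a reference pair of known fixed distance.
    """
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)

def distance_restraint(volume, ref_volume, ref_distance, slack=0.2):
    """Turn the estimate into a loose (lower, upper) bound; slack is arbitrary."""
    r = noe_distance(volume, ref_volume, ref_distance)
    return (max(1.8, r * (1 - slack)), r * (1 + slack))

# Hypothetical volumes, calibrated against a geminal pair assumed to be 1.8 A apart.
REF_VOLUME, REF_DISTANCE = 1000.0, 1.8
for vol in (1000.0, 100.0, 10.0):
    low, high = distance_restraint(vol, REF_VOLUME, REF_DISTANCE)
    print(f"volume {vol:7.1f} -> {noe_distance(vol, REF_VOLUME, REF_DISTANCE):.2f} A "
          f"(restraint {low:.2f}-{high:.2f} A)")
```

Because of the steep distance dependence, a hundredfold drop in volume corresponds to only about a twofold increase in distance. This is why restraints are normally applied as generous ranges rather than exact distances.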

The sign of the observed NOE (that is, whether the NOE increases the signal
intensity of coupling partners or decreases it) depends upon which mechanism discussed
above  predominates,  the  spin  flip  (exchange  of  spin  states  without  a  net  change  in 
overall  spin  populations),  or  simultaneous  gain  or  loss  of  spin  energy  by  both  coupled 
spins,  with  concomitant  change  in  overall  spin  populations.  Typically,  for  small 
Fig. 5.12 Portion of the 800 MHz 1H ROESY spectrum of mycinamicin, a macrolactone
antibiotic (~750 Da). Anti-phase (red-black alternating) cross-peaks are due to J-coupling, while pure
phase peaks are primarily through-space (ROE) correlations

molecules in nonviscous solvents, the simultaneous mechanism predominates, and
direct NOEs are positive (i.e., signals are enhanced), while for large molecules or
molecules in viscous solvents, the spin flip mechanism predominates and the NOE
decreases overall signal intensity. In small molecules, indirect NOEs (passed from one
spin to another via a common D-coupled partner) tend to be small and negative, and
can be distinguished from direct effects by the change in sign. The reasons for this
observation are detailed in several monographs [35, 36]. On the other hand, for most
proteins and nucleic acids, both direct and indirect NOEs are negative in sign and are
not readily distinguishable. This does not matter for the typical NOESY experiment
with relatively short mixing times, although if the mixing time is too long, the
information content decreases because NOEs can be observed between spins that are not
directly D-coupled to each other. However, a major issue arises for molecules that fall
in the molecular weight range of ~1,000 Da, in that the two mechanisms balance each
other and often no NOE is observed.
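
The balance between the two mechanisms can be made quantitative with the standard expression for the homonuclear cross-relaxation rate, which is proportional to 6J(2ω) − J(0), where J(ω) is the spectral density of molecular tumbling. The sketch below assumes isotropic tumbling with a single correlation time and a 600 MHz spectrometer. The correlation times and function names are illustrative assumptions.

```python
import math

def spectral_density(omega, tau_c):
    """Lorentzian spectral density for isotropic tumbling (unnormalized)."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)

def noe_cross_relaxation(omega, tau_c):
    """Homonuclear cross-relaxation rate, up to a positive constant factor.

    6*J(2w) is the double-quantum (simultaneous) term and J(0) the
    zero-quantum (spin-flip) term.
    """
    return 6.0 * spectral_density(2.0 * omega, tau_c) - spectral_density(0.0, tau_c)

OMEGA = 2.0 * math.pi * 600e6                 # 1H Larmor frequency, assumed 600 MHz
TAU_CROSS = math.sqrt(5.0) / (2.0 * OMEGA)    # where 6*J(2w) equals J(0)

for label, tau_c in [("small molecule", 5e-11), ("crossover", TAU_CROSS), ("protein", 5e-9)]:
    sigma = noe_cross_relaxation(OMEGA, tau_c)
    print(f"{label:15s} tau_c = {tau_c:.2e} s  omega*tau_c = {OMEGA * tau_c:6.2f}  "
          f"sigma = {sigma:+.2e}")
```

The rate changes sign where ωτc is approximately 1.12. Since the crossover depends on the product of Larmor frequency and correlation time, changing the field strength, temperature, or solvent viscosity can sometimes move a molecule away from the null.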

In such cases, an alternative experiment, called ROESY (for rotating frame NOE
spectroscopy) can be used [37, 38]. During the mixing time of ROESY, a spin-lock
field is applied to the precessing spins, much as in the TOCSY experiment.
Now coherences are kept in phase that would otherwise evolve, and D-coupled spins
can cross-relax. The simultaneous mechanism predominates for all molecular weight
ranges in the ROESY experiment, so all ROEs (rotating-frame NOEs) are positive,
even for molecules in the 1,000 Da range (Fig. 5.12). Because the spin-lock RF field
enables TOCSY transfers, care must be taken to distinguish true ROE (D-coupling
Fig. 5.13 Overlay of a 1H TOCSY spectrum (blue) onto the 1H NOESY of acireductone
dioxygenase (red). Both spectra were obtained at 750 MHz 1H. Region shown correlates NH (horizontal
axis) with CαH resonances (vertical axis). TOCSY peaks correspond to intra-residue
through-bond correlations, while NOESY cross-peaks indicate both intra- and inter-residue connectivities.
Superimposed red-blue peaks indicate an intra-residue connectivity, while those only in red are
inter-residue

effects) from TOCSY-type J-coupling transfers in ROESY datasets. Figure 5.13
shows an overlay of a TOCSY and NOESY experiment that illustrates this concept.


5.4.6   Residual Dipolar Couplings (RDCs)

We will find that D-couplings are the primary mechanism of information and energy
transfer between spins in solid-state NMR (SSNMR), because they are not averaged
to zero by molecular tumbling as they are in solution. In fact, D-coupling constants
can be very large (on the order of kHz for directly bonded atoms, as opposed to tens
to hundreds of Hz for 1-bond J-couplings), and special techniques are required to
simplify solid-state NMR spectra as a result of this. Furthermore, the orientation
dependence of the D-coupling (due to the 3cos²θ-1 term in Eq. 5.5) is an important
source of structural information. Each NOE exists in a unique frame of reference,
providing a distance between two D-coupled spins, but no orientation with respect
to the molecular frame. Consider a globular protein structure, calculated only with
NOE and secondary structure (J-coupling and chemical shift) restraints. One can
think  of  the  NOE  as  a  rubber  band  connecting  two  atoms  in  the  structure,  holding 
them  together  with  a  flexible  tether.  If  enough  rubber  bands  are  used  to  hold  together 
disparate  parts  of  the  polypeptide,  a  reasonable  structure  can  be  obtained.  However, 
if  the  molecule  under  consideration  contains  multiple  domains,  does  not  contain 
compact  tertiary  structure  (as  in  DNA  [39])  or  contains  regions  that  for  some  reason 
do  not  yield  clean  NOE  connectivities,  the  relative  orientations  of  those  domains  or 
regions  remain  in  doubt.  In  this  case,  while  local  structural  features  (helices,  sheets, 
turns)  might  be  well  defined  structurally,  the  overall  arrangement  of  these  local 
features  in  the  tertiary  or  quaternary  structure  is  difficult  to  determine.  Under  these 
circumstances,  residual  dipolar  couplings  (RDCs)  become  an  important  source  of 
structural  information. 

RDCs are measured by placing the molecule under investigation in a medium
that introduces a slight orientational preference with respect to the applied magnetic
field. Under these circumstances, the D-coupling is no longer averaged to zero, as it
is in an isotropic medium. The (highly attenuated) dipolar coupling can then be
measured as a change in the splitting relative to the J-coupling observed between the
coupled spins in an isotropic medium. The RDC reflects the orientation of the vector connecting the
coupled spins with respect to an alignment tensor, which is usually calculated in the
course of structural refinement. Given that there are four independent solutions to
RDC fittings, there is considerable ambiguity associated with RDC data in the case
of an a priori structure determination [40]. This problem can be dealt with by
measuring RDCs in more than one orienting medium, so that multiple independent
alignment tensors are used in the fitting.
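
The orientation dependence of an RDC is often written in the principal frame of the alignment tensor as D = Da[(3cos²θ − 1) + (3/2)R sin²θ cos 2φ], where θ and φ are the polar angles of the internuclear vector, Da is the magnitude of the alignment, and R is its rhombicity. The sketch below uses this commonly quoted form, which is not given in this chapter, together with the subtraction by which an RDC is extracted from aligned and isotropic splittings. All numerical values and function names are hypothetical.

```python
import math

def rdc_hz(theta_deg, phi_deg, d_a, rhombicity):
    """RDC for a bond vector at polar angles (theta, phi) in the alignment frame.

    Assumed standard form: Da * [(3cos^2(theta) - 1) + 1.5*R*sin^2(theta)*cos(2*phi)].
    """
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    return d_a * ((3.0 * math.cos(theta) ** 2 - 1.0)
                  + 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi))

def measured_rdc(splitting_aligned_hz, splitting_isotropic_hz):
    """RDC as the change in splitting (J + D) relative to the isotropic J."""
    return splitting_aligned_hz - splitting_isotropic_hz

DA, R = 10.0, 0.3   # hypothetical alignment magnitude (Hz) and rhombicity
for theta, phi in [(0.0, 0.0), (54.7356, 0.0), (90.0, 0.0), (90.0, 90.0)]:
    print(f"theta = {theta:7.3f}  phi = {phi:5.1f}  D = {rdc_hz(theta, phi, DA, R):7.2f} Hz")
print("measured RDC:", measured_rdc(101.5, 94.0), "Hz")
```

Several distinct orientations give the same value of D, which illustrates the ambiguity noted above and why data from a second, independent alignment medium help to resolve it.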

A variety of orienting media have been tested for measuring RDCs, and which is
chosen often depends upon the physical and chemical properties of the molecule of
interest. Indeed, any species with a high degree of magnetic susceptibility anisotropy
(that is, the energy of the molecule varies depending on its orientation in the
magnetic field) has the potential to provide an alignment medium. Some alignment media
are oriented by the applied field (phage particles [41, 42], nematic liquid crystals
[43-45] and bicelles/micelles [46, 47]), while in others, particularly stretched and
compressed polyacrylamide gels, alignment is generated mechanically. Even
chemical modification can be used for orienting macromolecules: If a polydentate ligand is
introduced by covalent linkage, a paramagnetic metal ion with a high degree of
magnetic anisotropy (e.g., Co(II)) can be used to obtain alignment [48]. Some of the
earliest characterizations of RDCs were made using heme proteins, in which the
metalloporphyrin exhibits sufficient magnetic anisotropy to induce a degree of
alignment without addition of an external alignment medium [49]. Indeed, any species
that has nonzero magnetic susceptibility anisotropy will likely exhibit some
alignment; most globular proteins fall into this category [50]. However, the degree of
alignment is almost always much smaller than what can be obtained using a defined
aligning medium, and in order to distinguish the RDC from J-coupling and other
effects, measurements must be made at multiple magnetic field strengths using
specialized experiments for precise measurement of coupling constants.

5.4.7   Solid-State NMR (SSNMR)


The first NMR experiments were performed on a solid sample (paraffin wax) [51]. However, for
many years, the relative simplicity of performing and analyzing solution NMR experiments
meant that nonexperts were unwilling to venture into SSNMR, which is technically more
demanding and yields spectra that are harder to interpret. The reasons for this are
understandable: The orientational dependence of the chemical shift in the absence of
isotropic averaging means that in un-oriented ("powder") samples, all orientations of a given
species are present, so each spin gives rise to a complex resonance bounded by the maximum
and minimum frequencies possible given the chemical shielding tensor of the spin in question
(Fig. 5.14). The situation is further complicated by the presence of non-averaged dipolar
couplings. As noted above, the dipolar coupling is typically much larger than J-couplings,
and also has an orientational dependence, so that the J-coupling between two spins (which
does not depend upon magnetic field strength or molecular orientation) is typically swamped
by much larger (and variable) dipolar splittings (Fig. 5.15). Finally, while T2 relaxation is
usually very efficient in solid-state samples, leading to broad lines, particularly for ¹H
spins, T1 relaxation is generally less efficient, so that unless special methods are
employed, experiments that yield acceptable signal to noise take longer than the
corresponding solution experiments. Nevertheless, it is often the case that a macromolecule
of interest is unsuited for solution NMR methods due to issues such as molecular weight,
spectral complexity, membrane association, or limited solubility. In these instances, SSNMR
provides some distinct advantages, and developments in commercially available SSNMR
accessories and implementations of SSNMR experiments have made SSNMR more accessible to
nonexperts. Probably the two most important such developments are the use of magic-angle
spinning and cross-polarization methods.


5.4.8    Magic-Angle  Spinning 

Magic-angle spinning (MAS) involves spinning a randomly oriented (powder) SSNMR sample at
high speeds, with the spinning axis placed at an angle θ = 54.73° (the "magic angle") with
respect to the applied magnetic field. MAS accomplishes two things. First, the spinning
averages all orientations of the molecules in the sample except those that lie along the
spinning axis. The observed chemical shift $\sigma_{obs}$ of a spin is related to θ, the
angle between the applied field and the spinning axis, by:

$$\sigma_{obs} = \sigma_{iso} + \tfrac{1}{2}\left(3\cos^{2}\theta - 1\right)\left(\sigma_{33} - \sigma_{iso}\right) \qquad (5.6)$$

where $\sigma_{iso}$ is the isotropic (solution) shift of the resonance under consideration
and $\sigma_{33}$ is the upfield limit of the chemical shift observed in the powder pattern
SSNMR spectrum. Because the second term in (5.6) goes to zero when θ = 54.73°, the resonance
is observed at $\sigma_{iso}$, the same frequency as in solution. Secondly, the most
significant terms contributing to the dipolar coupling also contain the form (3cos²θ − 1),
and also go to zero with MAS. In principle, if the spinning speed is sufficiently high, a
spectrum of singlets with CSA and D-coupling removed is observed. There are complications
for MAS, however. First, if sample spinning is insufficiently fast, each resonance resolves
into a series of lines (spinning side bands) separated from the central band at
$\sigma_{iso}$ by multiples of the spinning frequency (Fig. 5.14). Secondly, even the highest
obtainable spinning rates are usually insufficient to completely average the D-coupling
between ¹H and directly bonded heteronuclei (e.g., ¹⁵N, ¹³C). This retained coupling results
in considerable line broadening of the heteronuclear spins. As such, most SSNMR experiments
also employ high-power ¹H decoupling during acquisition of heteronuclear signals. Such
decoupling rapidly interchanges the ¹H spin states, so that the attached heteronuclear spin
"sees" an average spin state for ¹H and the dipolar coupling is removed.

Fig. 5.14 Effect of magic-angle spinning on the ¹³C SSNMR spectrum of glycine. Note that as
spinning speed increases (shown on the right of each spectrum), the spacing between the side
bands increases, and greater intensity is observed for the center transition. The x-axis is
the chemical shift of ¹³C. Courtesy of Prof. Janet Blumel, Texas A&M University

Fig. 5.15 Top: representation of a powder pattern SSNMR spectrum, with $\sigma_{iso}$
indicating the position of the isotropic (solution) shift of the observed resonance. Bottom:
doubling due to dipolar coupling D between two nuclei. For directly bonded nuclei and protons
bonded to the same atom, the coupling constant D can be on the order of kHz
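
The angular factor responsible for both effects can be checked numerically. The short sketch below evaluates (3cos²θ − 1)/2 at a few angles, finds the angle at which it vanishes, and lists where side bands would be expected for two spinning rates; the function names and the example frequencies are illustrative assumptions.

```python
import numpy as np

def p2(theta_deg):
    """Angular factor (3cos^2(theta) - 1)/2 that scales the anisotropic terms."""
    c = np.cos(np.radians(theta_deg))
    return 0.5 * (3.0 * c**2 - 1.0)

magic_deg = np.degrees(np.arccos(1.0 / np.sqrt(3.0)))  # root of 3cos^2(theta) = 1

for angle in (0.0, 30.0, magic_deg, 90.0):
    print(f"theta = {angle:6.2f} deg  scaling = {p2(angle):+.4f}")

def sideband_positions(iso_hz, spin_rate_hz, n_max=3):
    """Centre band plus side bands at integer multiples of the spinning rate."""
    return iso_hz + spin_rate_hz * np.arange(-n_max, n_max + 1)

# Hypothetical example: isotropic line at 0 Hz, spinning at 5 kHz and 10 kHz.
print(sideband_positions(0.0, 5000.0))
print(sideband_positions(0.0, 10000.0))
```

Doubling the spinning rate doubles the side band spacing, which is the trend described for Fig. 5.14; the sketch gives positions only and says nothing about side band intensities.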


5.4.9    Cross-Polarization and Selective Reintroduction of D-Couplings

The problems presented by large D-couplings and efficient T2 relaxation of ¹H mean that,
except in unusual circumstances (e.g., very high spinning speeds and dilution of ¹H spins),
¹H is not the observed nucleus in SSNMR experiments on macromolecules. However, heteronuclei
such as ¹³C and ¹⁵N have the same drawbacks that plague their direct observation in solution
NMR, namely low sensitivity and, often, long T1 relaxation times. To circumvent these issues,
most SSNMR experiments incorporate a cross-polarization step in which the Hartmann-Hahn
condition is achieved by the application of high-power, relatively long-duration RF pulses to
¹H and the heteronuclear spins simultaneously. This permits ¹H to be used as a reservoir for
both spin polarization and T1 relaxation. T1 relaxation is relatively efficient for ¹H, even
in solids, and cross-polarization allows the reservoir of ¹H spins to "drain" non-Boltzmann
populations from D-coupled heteronuclei, so that experiments can be repeated more rapidly for
an improved signal to noise ratio. The other benefit of cross-polarization is the transfer of
the inherently greater ¹H sensitivity to the D-coupled heteronuclei. The mechanism of
polarization transfer differs between solution and SSNMR: With the exception of the NOESY
experiment, polarization transfer between spins in solution is mediated by J-coupling,
whereas in SSNMR, D-couplings are used.
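
For orientation, the Hartmann-Hahn condition is usually stated as a match of the RF nutation frequencies of the two nuclei, |γ_H|·B1(¹H) = |γ_X|·B1(X). The sketch below, which uses standard tabulated gyromagnetic ratios and hypothetical function names, shows how much stronger the RF field on the heteronucleus must be and the largest polarization gain that transfer from ¹H could provide.

```python
# Gyromagnetic ratios in rad s^-1 T^-1 (standard tabulated values, rounded).
GAMMA = {"1H": 267.522e6, "13C": 67.283e6, "15N": -27.116e6}

def matched_b1(b1_proton_tesla, nucleus):
    """RF field on `nucleus` that satisfies |gamma_H|*B1_H = |gamma_X|*B1_X."""
    return b1_proton_tesla * abs(GAMMA["1H"]) / abs(GAMMA[nucleus])

def max_cp_gain(nucleus):
    """Upper bound on the polarization gain from 1H, |gamma_H / gamma_X|."""
    return abs(GAMMA["1H"]) / abs(GAMMA[nucleus])

for x in ("13C", "15N"):
    print(x, "B1 ratio and maximum gain:", round(max_cp_gain(x), 2))
```

The same ratio appears twice because it sets both the field strength needed for matching and the ceiling on the sensitivity gain; gains obtained in practice are lower.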

The D-coupling is also exquisitely sensitive to internuclear distances and orientations, and
is used to transfer magnetization between spins in multidimensional SSNMR experiments. In
order to recover the information content inherent in the D-coupling, it must be selectively
reintroduced into the SSNMR experiment (as the point of MAS combined with high-power ¹H
decoupling is to get rid of unwanted D-coupling effects). A common way to reintroduce
D-couplings is to use rotor-synchronized RF pulses. If an RF pulse is applied such that the
pulse occurs at the same point in the MAS rotation cycle each time, the motional averaging is
removed for a fraction of the rotation period, and the D-coupling between spins affected by
the RF pulse is reinstated momentarily. Because the RF pulse is always applied at the same
point in the rotor cycle, the spins in the sample are affected in the same way each time, and
an attenuated but non-averaged D-coupling develops between affected spins. An example of this
type of experiment is REDOR (rotational-echo double resonance) [52], in which the distance
between two heteronuclei (e.g., ¹⁵N and ¹³C) can be measured relative to a known reference by
the difference in signal intensity of one of the coupling partners as a function of whether
or not rotor-synchronized pulses are applied to the other during the pulse sequence.
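
The distance sensitivity mentioned above follows from the inverse-cube dependence of the dipolar coupling constant on internuclear distance. As a rough illustration, the sketch below evaluates the standard point-dipole expression for a ¹³C-¹⁵N pair; the function name and the chosen distances are assumptions for illustration, and the calculation ignores motional averaging.

```python
import math

MU0_OVER_4PI = 1.0e-7          # T m A^-1
HBAR = 1.054571817e-34         # J s
GAMMA_13C = 67.283e6           # rad s^-1 T^-1
GAMMA_15N = -27.116e6          # rad s^-1 T^-1

def dipolar_coupling_hz(r_angstrom, gamma_i=GAMMA_13C, gamma_s=GAMMA_15N):
    """Magnitude of the dipolar coupling constant D (Hz) at distance r."""
    r = r_angstrom * 1.0e-10   # convert to metres
    return MU0_OVER_4PI * abs(gamma_i * gamma_s) * HBAR / (2.0 * math.pi * r**3)

for r in (1.5, 2.5, 4.0, 6.0):
    print(f"r = {r:.1f} A  ->  D = {dipolar_coupling_hz(r):7.1f} Hz")
```

Doubling the distance reduces the coupling eightfold, which is why recoupling experiments such as REDOR become progressively harder to interpret at longer distances.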


5.4.10    Resonance Assignments by Solid-State NMR

REDOR is a 1D NMR sequence and is most useful when both heteronuclear signals have already
been assigned to specific atoms in the structure. As with solution-state NMR, the usefulness
of SSNMR as applied to biological macromolecules is greatly improved by the availability of
specific resonance assignments. For many years, such assignments were limited to targets of
opportunity (labeled cofactors, unique residues, etc.). However, a variety of
multidimensional SSNMR experiments developed over the past 10 years have made it possible to
make sequential resonance assignments for small to midsize proteins [53-55]. The naming
protocol for many of these experiments is similar to that used for solution NMR of proteins,
in that the spin types correlated within the experiment give rise to the names: NCO and NCA
correlate amide ¹⁵N resonances with those of adjacent carbonyl (CO) and Cα ¹³C spins,
respectively, and NCOCX and NCACX extend the correlations to the next carbon in the coupling
path, similar to the HNCACB experiment for solution-state NMR. For these types of
two-dimensional heteronuclear SSNMR experiments, the detected nucleus is ¹³C, with
cross-polarization between ¹H and ¹⁵N in the first step. Polarization transfer between the
spins of interest in many of these experiments is achieved by Hartmann-Hahn transfer. In
order to ensure that all spins of interest reach the Hartmann-Hahn condition at some point
during the mixing period, amplitude- or frequency-modulated pulses are applied to one of the
coupled spins so that a range of frequencies covering the spectral region of interest is
present during the mixing period. Polarization transfer can also be achieved in SSNMR using
REDOR sequences (for ¹⁵N, ¹³C transfers) [56] or radio frequency-driven recoupling (RFDR)
[57].

Recently, there has been renewed interest in direct ¹H observation and assignment in SSNMR
using a combination of very high MAS spinning speeds (28 kHz) and dilution of proton spins.
The high spinning speeds reduce ¹H line widths sufficiently to provide an acceptable degree
of resolution in 2D ¹H-¹⁵N correlation spectra, and ¹H dilution can be obtained by expressing
the protein in perdeuterated media and then allowing limited exchange of amide deuterons for
protons. This reduces ¹H-¹H D-coupling sufficiently for the detection of resolved NH
correlations [58, 59].


5.4.11    Macromolecular Structure Determination by NMR

The first use of two-dimensional NMR to determine the de novo structure of a protein (bull
semen protease inhibitor, or BUSI) by the Wüthrich group in 1985 was rightly hailed as a
milestone of structural biology [60]. While small by today's standards (57 residues), the
BUSI structure demonstrated that NMR could stand alone as a method for determining
biomolecular structures, once the exclusive domain of X-ray crystallography. In the
intervening years, increasing magnetic field strength, improvements in instrumentation and
data handling, introduction of multidimensional and multinuclear assignment methods, and
selective and uniform isotope labeling have rendered the determination of small soluble
protein structures routine. Once a reasonably complete set of resonance assignments is in
hand, the process of structure determination can begin. For compact soluble proteins, with
molecular weights up to ~30 kDa, the process is relatively straightforward. NOEs must be
identified and tabulated, especially those that are diagnostic of particular types of
secondary structures. For example, α-helices are characterized by strong sequential NH-NH
NOEs and reasonably strong CαH-NH and CαH-CβH NOEs between residues i and i+3 within the
helix, while antiparallel β-sheets are often identified by inter-strand CαH-CαH and NH-NH
NOEs. Parallel sheets also exhibit characteristic NOE patterns, as do β-turns. An excellent
summary of NOE patterns typical of different secondary structural features can be found in
Wüthrich's classic text on the subject [61]. Equally important are NOEs relating disparate
secondary structural features to determine a global fold. NOEs between nonsequential
secondary structures are typically observed between side chain resonances, and require
reasonably complete side chain assignments. Dihedral angle restraints are another important
source of structural information. Coupling constants can be measured and related to dihedral
angles by means of Karplus-type relationships, which are empirically derived relations
between observed three-bond J-couplings and the dihedral angle between the coupled spins.
Furthermore, ¹H, ¹⁵N, and ¹³C chemical shifts of identified resonances can be related to
backbone and side chain dihedral angles using a variety of statistical methods. Finally, RDCs
can often relate the orientations of disparate secondary structural features that are too far
apart to exhibit NOEs between each other. We have recently described the use of RDCs to
characterize the solution conformations of cytochrome P450cam (46 kDa) via a restrained
"gentle annealing" molecular dynamics (MD) protocol that found families of best-fit
conformers that the enzyme occupies in solution [62, 63]. A number of other large proteins
have been successfully characterized using RDC restraints [64, 65].
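
As an illustration of the Karplus-type relationships mentioned above, the sketch below evaluates the general form ³J = A cos²θ + B cos θ + C for the backbone HN-Hα coupling, with θ = φ − 60°. The coefficients are those of one widely cited parametrization and are an assumption here, not values taken from this chapter; other couplings and other parametrizations use different numbers.

```python
import math

# Assumed coefficients (Hz) of a widely used parametrization for the
# backbone HN-Halpha coupling; other couplings need different values.
A, B, C = 6.51, -1.76, 1.60

def j_hn_ha(phi_deg):
    """Three-bond HN-Halpha coupling (Hz) predicted for backbone angle phi."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

for label, phi in (("alpha-helix", -57.0), ("beta-strand", -120.0)):
    print(f"{label:12s} phi = {phi:7.1f} deg  3J = {j_hn_ha(phi):.1f} Hz")
```

Because the curve is not single-valued, a measured coupling is generally compatible with more than one dihedral angle, so such restraints are usually applied as allowed ranges and combined with NOE and chemical shift information.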


5.4.12    Dealing with Larger Macromolecules and Membrane-Bound Proteins by NMR

Above 30 kDa, structure determination becomes more difficult. While signal to noise and
resolution issues have receded with improved technology, the problem of spectral complexity
and overlap makes the process of resonance assignment for larger soluble biomacromolecules by
NMR less straightforward. Furthermore, T2 relaxation rates increase with longer correlation
times (slower molecular tumbling), to the point where the coherence transfer steps required
for multidimensional NMR experiments take longer than the lifetimes of the coherences
involved. This problem can be ameliorated to a large extent by the preparation of extensively
deuterated samples. By decreasing the ¹H density of the sample via deuteration,
cross-relaxation (T2) pathways are made less efficient, and many of the standard
triple-resonance experiments for sequential backbone assignments are applicable to
biomolecules with molecular weights >50 kDa. However, deuteration comes at the cost of
reducing the concentration of the most desirable nucleus for detection, ¹H. In order to make
use of standard triple-resonance experiments with uniformly deuterated samples, the NH
protons critical for these experiments must be reintroduced either by unfolding-refolding
(when possible) or by hydrogen-deuterium exchange using protonated buffers. For well-folded
proteins and those not amenable to unfolding/refolding, buffer exchange is often incomplete,
and non-exchangeable and isolated exchangeable NH groups will be difficult to assign. The
deuteration of side chains also prevents the use of HCCH-TOCSY or other ¹H-¹³C correlation
experiments to assign side chain resonances, or even to detect side chain-side chain NOEs,
which are critical for de novo characterization of tertiary structure in NOE-based structure
determinations. Selective reprotonation schemes have been described that allow limited
reprotonation of side chain methyl groups, which in turn can be used to detect nonsequential
NOEs for structural work.

Even in the absence of sample deuteration, it is possible to extend many ¹H-detected
experiments for use with molecules over 30 kDa by modifying the detection portion(s) of the
sequence with TROSY selection. TROSY (for transverse relaxation optimized spectroscopy) takes
advantage of cross-correlation between the local fluctuations in magnetic fields that result
in spin-spin relaxation of ¹H-¹⁵N coupled spin pairs [66]. These local magnetic field
fluctuations have both magnitude and sign, and can superimpose either constructively (both
with the same sign) or destructively (opposite sign). If a ¹H-¹⁵N HSQC experiment is acquired
in un-decoupled mode, one observes a quartet of peaks, split by $J_{NH}$ in both the ¹H and
¹⁵N dimensions (see Fig. 5.5). Of the four peaks, one is considerably narrower and more
intense than the other three, indicating a destructive interference between the fields that
give rise to spin-spin relaxation. TROSY-based experiments make use of phase cycling to
select for just the narrowest peak of the quartet. This allows molecules that would otherwise
give rise to broad overlapped spectra to be characterized in detail by NMR methods. CRINEPT
(cross-correlated relaxation enhanced polarization transfer) is a passive version of this
experiment, in which no selection is performed, but the broader components of the multiplets
are not observed simply because of their large line widths [67]. CRINEPT works best for very
large molecules (>60 kDa), while TROSY often improves spectral appearance and
interpretability even for relatively modest-sized molecules (>25 kDa).

Many proteins of interest to the biophysicist are membrane-bound or membrane-associated. Even
if the protein itself is of manageable size and appropriately labeled for NMR purposes, the
presence of a lipid bilayer of large and/or indeterminate size complicates the proceedings
considerably. Membrane association always increases NMR line widths, rendering many coherence
transfer experiments difficult because of short coherence lifetimes. Furthermore, unless
appropriately deuterated lipids are used, unsuppressed signals from alkyl chains in the lipid
create problems for observation of ¹H signals upfield from water (in the alkyl region of the
¹H spectrum). Often, structural determinations of membrane-bound species are best approached
by using a combination of SSNMR and solution methods [68]. Given that MAS SSNMR yields the
same chemical shifts as solution methods (assuming no change in environment), a useful
approach is to prepare samples in micelles or bicelles of relatively defined size that are
amenable to solution NMR assignment methods, and use these assignments in the analysis of
SSNMR data, from which structural information can be obtained. A reasonable number of
membrane protein structures have been determined by NMR to date, often using a combination of
methods [13, 69, 70].


5.5    Dynamic  Information  Available  from  NMR 


While  NMR  is  an  important  structural  tool,  the  advent  of  high-throughput  struc- 
tural genomics  projects,  with  the  automation  of  crystallization  trials,  wide  access 
to  beam  lines  and  user-friendly  diffraction  analysis  software  means  that,  with  the 
exception  of  small  proteins,  polypeptides,  and  polynucleotides  that  resist  crystal- 
lization, initial  structure  determinations  of  biological  macromolecules  will  most 
often  be  done  using  X-ray  diffraction.  Where  NMR  comes  into  its  own  is  in  char- 
acterizing the  solution  dynamics  of  macromolecules.  Crystallization  usually  cap- 
tures only  one  major  conformation,  although  some  regions  of  the  molecule  may 
exhibit  polymorphism  in  different  members  of  the  asymmetric  unit,  or  show  mul- 
tiple conformations  locally.  Often,  a  significant  problem  with  a  crystallographic 
structure  is  determining  what  relationship  it  has  (if  any)  with  the  biological  func- 
tion of  the  macromolecule:  It  may  be  that  the  conformer  that  crystallizes  is  consid- 
erably different  from  the  preferred  ensemble  of  conformers  in  solution  or  the 
biologically functional form. A recent example of this is the case of IscU, a scaffold
protein from E. coli that aids in the construction of iron-sulfur clusters in vivo.
The  crystallographic  structure  of  IscU  in  complex  with  desulfurase  IscS  shows  a 
well-folded  IscU  [71],  while  NMR  data  suggest  that  IscU  is  mostly  unfolded  in  the 
solution  complex  with  IscS  [72].  Furthermore,  it  has  become  apparent  that  in  vivo 
many  proteins  sample  multiple  conformations  and  oligomerization  states,  and 
NMR  can  provide  insight  at  atomic  resolution  in  such  cases  [11,  73].  The  value  of 
NMR  in  these  cases  comes  from  its  ability  to  characterize  macromolecular  dynam- 
ics on  a  wide  range  of  timescales  (Fig.  5.16)  [74].  We  will  describe  some  of  the 
methods  used  to  extract  dynamic  information  in  approximate  order  of  the  decreas- 
ing timescales  that  they  represent. 

Fig. 5.16 Approximate time scales of macromolecular motions and appropriate NMR methods
that access those motions


5.5.1    Diffusion  Measurements 

In a typical NMR experiment, considerable effort is expended in making sure that the magnetic
field is homogeneous throughout the sample volume, to ensure that individual resonance lines
are as narrow and symmetric as possible. However, it is possible, through the application of
linear magnetic field gradients, usually as pulsed field gradients (PFGs), to extract
position-specific data for a small volume of a sample. Indeed, this is the basis of magnetic
resonance imaging; the appropriate application of PFGs allows 3D images of inhomogeneous
samples to be constructed. In molecular spectroscopy, PFGs can be used to encode positions of
a given molecule within the sample. This positional information can then be decoded at a
later time by application of a second PFG. In the time between encoding and decoding, the
molecule will diffuse away from the position at which it was encoded. Depending upon how far
the molecule diffuses, the decoding by the second PFG will be more or less complete. If a
molecule does not move far from the position it occupied during the encoding PFG, it will
decode more completely (and give rise to a more intense spectral peak) than those molecules
that diffuse farther from their starting position. From these data, a diffusion coefficient
for a given species can be extracted. Besides providing information about molecular size,
shape, and degree of oligomerization, diffusion measurements can be used to screen small
molecules that bind to macromolecules. Because macromolecules diffuse more slowly, they tend
to lose less signal intensity during the diffusion delay between the encoding and decoding
PFGs than their smaller counterparts. However, if a small molecule binds preferentially to a
macromolecule, its diffusion will be slowed relative to other small molecules in the mixture,
and its signals will be correspondingly less attenuated upon decoding. Because the target
macromolecule does not have to be characterized (or indeed even be in high enough
concentration to be observed in the experiment), diffusion-ordered spectroscopy (DOSY)
provides a convenient rapid screen of small molecule mixtures for binding to a macromolecular
target [75].
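
The attenuation described above is commonly modeled with the Stejskal-Tanner expression, I/I0 = exp[−D(γgδ)²(Δ − δ/3)], for rectangular gradient pulses of strength g and length δ separated by a diffusion delay Δ. The sketch below is a minimal illustration under that assumption; the function names, the gradient ramp, and the two diffusion coefficients are hypothetical values chosen only to contrast a free ligand with one that diffuses with a macromolecule.

```python
import numpy as np

GAMMA_1H = 267.522e6  # rad s^-1 T^-1

def pfg_attenuation(D, g, delta=2.0e-3, Delta=0.1, gamma=GAMMA_1H):
    """I/I0 for diffusion coefficient D (m^2/s) and gradient strength g (T/m).

    delta is the gradient pulse length and Delta the diffusion delay (s)."""
    return np.exp(-D * (gamma * g * delta) ** 2 * (Delta - delta / 3.0))

def fit_diffusion(g, intensity, delta=2.0e-3, Delta=0.1, gamma=GAMMA_1H):
    """Recover D from the slope of ln(I/I0) against the gradient factor."""
    b = (gamma * g * delta) ** 2 * (Delta - delta / 3.0)
    slope, _ = np.polyfit(b, np.log(intensity), 1)
    return -slope

g = np.linspace(0.02, 0.5, 12)                 # gradient ramp, T/m
free_ligand = pfg_attenuation(5.0e-10, g)      # hypothetical small molecule
bound_ligand = pfg_attenuation(1.0e-10, g)     # same ligand diffusing with a protein
print(fit_diffusion(g, free_ligand), fit_diffusion(g, bound_ligand))
```

In a screening experiment the comparison would be made on measured peak intensities rather than simulated ones, and a ligand in fast exchange between free and bound states would report a population-weighted diffusion coefficient between the two limits.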


5.5.2    Hydrogen-Deuterium (H/D) Exchange

The slowest internal dynamic process that can be measured by solution NMR (10⁻⁷ to 10⁻⁴ s⁻¹)
is the chemical exchange of internal amide (or imino) protons for deuterium. Experimentally,
these measurements are straightforward: The macromolecule is prepared fully protonated with
¹⁵N labeling, rapidly exchanged into a perdeuterated buffer, and the loss of signal intensity
for individual exchangeable protons is followed by a convenient experiment such as HSQC.
While many surface-exposed protons will exchange before the first HSQC experiment is started,
internal protons, particularly those involved in hydrogen bonding within regular secondary
structures, will exchange slowly enough that the time course of exchange can be followed and
exchange rate constants calculated. H/D exchange rates have been modeled as a measure of
local unfolding and de-protection of particular protons, and provide insight into the
likelihood that a particular region of the macromolecule is solvent-accessible on a given
time scale. Although few biologically important processes happen on the time scale of H/D
exchange, this method can be used to identify more mobile parts of the protein that are
likely to be dynamic on functionally relevant time scales. Because H/D exchange rates are
sensitive to pH, it is important to monitor this variable in the course of exchange
experiments.
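
Extracting an exchange rate constant from such a time course amounts to fitting a first-order decay to the peak intensity of each amide. The sketch below does this by linear regression on the logarithm of the intensity, using synthetic data; the rate constant, sampling times, and noise level are invented for illustration.

```python
import numpy as np

def fit_exchange_rate(t, intensity):
    """First-order rate constant (units of 1/t) from I(t) = I0*exp(-k*t)."""
    slope, intercept = np.polyfit(t, np.log(intensity), 1)
    return -slope, np.exp(intercept)

# Synthetic peak heights for one amide, sampled at successive HSQC spectra.
rng = np.random.default_rng(0)
t_min = np.arange(10.0, 610.0, 30.0)           # minutes after buffer exchange
true_k = 5.0e-3                                # min^-1, hypothetical
heights = 100.0 * np.exp(-true_k * t_min) * (1.0 + 0.01 * rng.standard_normal(t_min.size))

k_fit, i0_fit = fit_exchange_rate(t_min, heights)
print(f"k = {k_fit:.2e} min^-1, half-life = {np.log(2) / k_fit:.0f} min")
```

A log-linear fit is adequate while the signal is well above the noise; once peaks approach the noise floor, a nonlinear fit that includes a baseline term is usually the safer choice.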


5.5.3    Chemical  Shift  Perturbation  and  Line  Width  Changes 

Because chemical shifts are sensitive to environmental factors (hydrogen bonding, solvation,
steric interactions, etc.), local environmental perturbations can result in chemical shift
changes of nearby spins. The extent of these shifts contains dynamic information resulting
from the chemical shift time scale, which extends from ~10¹ to 10⁴ Hz. In the case of a spin
exchanging between two environments, A and B, the extent of the perturbation (in Hz, or s⁻¹)
relative to the exchange rate ($k_{ex}$, also in s⁻¹) determines how the phenomenon affects
the appearance of the spectrum. If the rate of exchange between sites is less than the
difference in chemical shifts $|\nu_A - \nu_B|$ for the spin at the two sites, there will be
two peaks in the spectrum, one at $\nu_A$ and one at $\nu_B$, with integrations proportional
to the relative populations of the sites. The exchange in this case is referred to as being
slow on the chemical shift time scale. However, as the exchange rate increases, the line
widths of the two peaks also increase, reflecting the shortened lifetimes of the spin at each
of the two sites. When the rate of exchange becomes comparable to the difference in chemical
shifts of the two sites, the two peaks coalesce into a single very broad peak at the weighted
average shift $\nu_{obs} = \chi_A \nu_A + (1 - \chi_A)\nu_B$, where $\chi_A$ is the
fractional population of site A. Increasing the exchange rate even further results in a
narrowing of the single peak, until at very fast exchange, the single line exhibits a line
width similar to those of other, non-exchanging peaks assigned to the molecule. Such exchange
is fast on the chemical shift time scale (Fig. 5.17). For two-site exchange, the broadening
that occurs as the exchange rate slows towards the slow exchange limit provides information
about the rate of the dynamic process affecting the line width according to (5.7):


(5.7)


where $\Delta\nu_{1/2}$ is the line width in Hz at half height. As such, comparison of line
widths as a function of experimental variables, as in titration of an enzyme by a substrate
or cofactor, provides direct information about the rate of exchange processes near the
slow-fast exchange limit.

Fig. 5.17 Spectral appearance as a function of exchange rate ($k_{ex}$) of chemical exchange
between two equally populated sites. The two sites differ in chemical shift by 40 Hz. The
separation in chemical shift affects the transition between slow and fast exchange: The
greater the separation, the faster the exchange must be to cause the two peaks to coalesce.
Note that the integration of the peak areas does not change (see amplitude scales)
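
The progression from slow to fast exchange can be reproduced with a short simulation. The sketch below solves the two-site Bloch-McConnell equations in the frequency domain for two equally populated sites separated by 40 Hz; the convention k_ex = k_AB + k_BA, the exchange-free relaxation rate `r2`, and the function names are assumptions made for this illustration.

```python
import numpy as np

def two_site_spectrum(freq_hz, nu_a, nu_b, kex, pop_a=0.5, r2=5.0):
    """Absorption lineshape for A <-> B exchange (Bloch-McConnell equations).

    kex = k_AB + k_BA (s^-1); r2 is the exchange-free transverse rate (s^-1)."""
    pop_b = 1.0 - pop_a
    k_ab, k_ba = pop_b * kex, pop_a * kex
    # dM/dt = A M for the transverse magnetization of the two sites.
    A = np.array([[2j * np.pi * nu_a - r2 - k_ab, k_ba],
                  [k_ab, 2j * np.pi * nu_b - r2 - k_ba]])
    p = np.array([pop_a, pop_b])
    omega = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    # Fourier transform of sum(exp(A t) p): 1^T (i*omega*I - A)^-1 p
    M = 1j * omega[:, None, None] * np.eye(2) - A
    return (np.linalg.inv(M) @ p).sum(axis=-1).real

def count_peaks(y):
    """Number of interior local maxima."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))

freq = np.linspace(-60.0, 60.0, 1201)
for kex in (5.0, 50.0, 180.0, 1000.0, 5000.0):
    s = two_site_spectrum(freq, -20.0, 20.0, kex)
    print(f"kex = {kex:7.1f} s^-1  peaks = {count_peaks(s)}  max at {freq[np.argmax(s)]:+.1f} Hz")
```

Setting `pop_a` away from 0.5 moves the fast-exchange peak toward the more populated site, in line with the weighted-average expression given in the text.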


5.5.4    EXSY and zz-Exchange Spectroscopy

Most internal macromolecular dynamics are fast on the chemical shift timescale,
with a few notable exceptions. X-proline cis-trans amide isomerizations are usually
slow (<0.06 s⁻¹), but have been found to accompany functionally important
conformational changes. There is increasing evidence that X-Pro isomerizations provide a
general mechanism for biological switches between active and inactive
conformations of proteins and enzymes [76]. Aromatic ring flips (resulting in the exchange of
nominally equivalent atoms on either side of a tyrosine or phenylalanine ring) are
occasionally observed to be slow on the chemical shift timescale, and provide
information on slow dynamics of protein interiors. An intriguing example of slow
internal exchange is heme reorientation in cytochrome b5. Despite ligation by two axial
histidine ligands, the heme porphyrin exhibits exchange between two orientations related
by rotation around a pseudosymmetry axis [77]. On the other hand, many important
bimolecular events, such as substrate and cofactor binding, are often slow on the chemical
shift timescale [78]. Rate constants for such phenomena can often be extracted by
magnetization transfer between sites, using experiments such as EXSY (exchange
spectroscopy) and zz-exchange HSQC. In these two-dimensional experiments, peaks
related by exchange are correlated by cross-peaks, the intensity of which
varies with the mixing time τm during which the exchange takes place [79]. By
relating the cross-peak volumes to τm, rate constants for the exchange processes can
be calculated. It should be noted that T1 relaxation competes with exchange transfer,
and if the exchange rate is much lower than the spin-lattice relaxation rate constant,
the EXSY experiment will not report on the exchange.
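
The competition between exchange and T1 relaxation can be made concrete with a short calculation. The sketch below assumes the simplest case: symmetric two-site exchange with equal populations, a single one-way rate constant `k`, and the same longitudinal relaxation rate `r1` at both sites, for which the diagonal and cross-peak intensities have a standard closed form. The function names and the numerical values are illustrative assumptions, not taken from the text.

```python
import math


def exsy_intensities(tau_m, k, r1):
    """Relative diagonal and cross-peak intensities after mixing time tau_m (s)
    for symmetric two-site exchange (one-way rate k, s^-1) with a common
    longitudinal relaxation rate r1 (s^-1)."""
    decay = math.exp(-r1 * tau_m)          # loss of magnetization by T1 relaxation
    mix = math.exp(-2.0 * k * tau_m)       # approach to equal sharing between sites
    diagonal = 0.5 * (1.0 + mix) * decay
    cross = 0.5 * (1.0 - mix) * decay
    return diagonal, cross


def optimal_mixing_time(k, r1):
    """Mixing time that maximizes the cross-peak intensity in this model."""
    return math.log(1.0 + 2.0 * k / r1) / (2.0 * k)


if __name__ == "__main__":
    r1 = 1.0  # assumed T1 of 1 s
    for k in (0.05, 0.5, 5.0):
        tau = optimal_mixing_time(k, r1)
        _, cross = exsy_intensities(tau, k, r1)
        print(f"k = {k:5.2f} s^-1: best tau_m = {tau:.3f} s, max cross-peak = {cross:.3f}")
```

In this model the ratio of cross-peak to diagonal intensity is tanh(k·τm), so k can be estimated from a series of mixing times; when k is much smaller than `r1`, the maximum cross-peak intensity is only a few percent of the starting magnetization.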


5.5.5    Heteronuclear  Relaxation 

In the section above dealing with nuclear spin relaxation, we noted in passing that
spin relaxation is induced by local random electromagnetic fluctuations at the
frequency corresponding to that of the spectroscopic transition involved. It is now time
to consider this in greater detail. In a 1H-15N bonded pair, there are four dipolar
relaxation pathways that require local electromagnetic fluctuations at the
appropriate frequencies. These include a single 15N spin state change (Larmor frequency ωN),
a single 1H spin state change (ωH), a two-quantum (ωH + ωN) and a zero-quantum
(ωH − ωN) transition (ω = 2πν). The relative probabilities of any of these transitions
occurring are directly related to the motional behavior of the molecule to which the
spin pair belongs. The random motion of molecules in a magnetic field produces an
electromagnetic "white noise" that can cause the transitions leading to relaxation.
However, that "noise" must contain frequency elements corresponding to the
transition of interest in order for that transition to occur. The frequency range over which
electromagnetic noise is nonzero is described by the spectral density (Fig. 5.18). For
small molecules tumbling rapidly in nonviscous solvents, the spectral density is
nonzero over a wide range of frequencies, so that all of the relaxation pathways of
interest are available. However, large molecules tumble slowly, resulting in spectral
densities that are poor in high-frequency elements, and relaxation reflects this by
mostly taking place via lower frequency pathways (zero- and single-quantum
transitions). Because spin-lattice (T1) relaxation requires that energy be removed from the
spin system, not just moved from one spin to another, the lack of a two-quantum
transition results in T1 processes being less significant relative to T2 (spin-spin)
relaxation in large molecules without major internal motions.

Fig. 5.18 Spectral density, J(ω), plotted for two correlation times, τc = 10⁻¹¹ s and 10⁻¹² s.
Because the curves are plotted on a logarithmic scale, the areas under the curves are not equal.
However, the available power is constant at a constant temperature. The efficiency of any
relaxation process connecting spin states with a transition frequency ω is proportional to the
spectral density J(ω) at that frequency
[Plot: curves labeled τc = 10⁻¹¹ s and τc = 10⁻¹² s; horizontal axis log ω]

Experiments have been designed to measure heteronuclear (X) T1, T2, and X{1H}
NOE in macromolecules that enable numerical values for the correlation time, τc, to
be calculated, and a measure of the random motions of the molecules involved to be
obtained [80]. The correlation time can be thought of as the mean time required for
the molecule to reorient by 1 rad in any direction, and is the reciprocal of the
high-frequency cut-off for the spectral density. Above ω = τc⁻¹, the spectral density
decreases rapidly to zero, militating against any relaxation pathways requiring
frequencies greater than τc⁻¹. Note that for many macromolecules, this cut-off
frequency is ~10⁸ s⁻¹, well below the frequency required to stimulate double-quantum
transitions at high magnetic field. However, if there are local motions within the
macromolecule, these will be reflected in an increase in higher frequency
fluctuations, and relaxation in the affected regions will reflect this increased local motion.
Relaxation of 15N in proteins is often interpreted semi-quantitatively as the ratio
between 15N T1 and T2 graphed as a function of sequence, since this ratio reflects the
relative contributions of possible relaxation pathways. The ratio tends to remain
near a constant value for regions of the biomolecule that do not exhibit a high degree
of conformational flexibility, but shows deviations for regions of greater mobility.
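
The effect of the cut-off at ω = τc⁻¹ can be illustrated with the spectral density for isotropic rotational diffusion of a rigid molecule, J(ω) = (2/5)·τc/(1 + ω²τc²). This functional form is the standard textbook expression and is assumed here; the text itself does not write it out. The sketch evaluates J(ω) at the transition frequencies of a 1H-15N pair, using assumed Larmor frequencies of 600 MHz (1H) and 60.8 MHz (15N) for a 14.1 T magnet.

```python
import math

# Assumed Larmor frequencies (Hz) at 14.1 T; magnitudes only.
NU_H = 600.0e6
NU_N = 60.8e6


def spectral_density(omega, tau_c):
    """J(omega) for isotropic tumbling with correlation time tau_c (s);
    omega in rad/s."""
    return 0.4 * tau_c / (1.0 + (omega * tau_c) ** 2)


def transition_frequencies():
    """Angular frequencies (rad/s) of the dipolar relaxation pathways."""
    two_pi = 2.0 * math.pi
    return {
        "zero frequency": 0.0,
        "15N single quantum": two_pi * NU_N,
        "zero quantum (H - N)": two_pi * (NU_H - NU_N),
        "1H single quantum": two_pi * NU_H,
        "two quantum (H + N)": two_pi * (NU_H + NU_N),
    }


if __name__ == "__main__":
    for tau_c in (1e-11, 1e-8):  # small molecule vs. macromolecule
        print(f"tau_c = {tau_c:.0e} s")
        j0 = spectral_density(0.0, tau_c)
        for name, omega in transition_frequencies().items():
            ratio = spectral_density(omega, tau_c) / j0
            print(f"  {name:22s} J(omega)/J(0) = {ratio:.2e}")
```

For the short correlation time all pathways see nearly the same spectral density, whereas for τc = 10 ns the two-quantum pathway is suppressed by more than three orders of magnitude relative to J(0).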

The simplest method for extracting numerical dynamic information from
heteronuclear relaxation is the model-free approach [81], which is based on the
assumption that motional behavior can be represented by two correlation times, an overall
molecular correlation time τm and a local effective correlation time τe. For the
model-free approach to be effective, the overall correlation time τm must be much
longer than τe. Fitting of experimental T1, T2, and heteronuclear NOE yields values
of τm, τe, and an order parameter (S²), which is a measure of the degree of freedom
that a local structural feature exhibits with respect to the overall tumbling of the
macromolecule. An order parameter S² = 1 indicates that there is no motion of the
local structure independent of the overall motion of the molecule, while S² = 0
indicates that the local structure moves independently of the molecule as a whole.
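
The roles of τm, τe, and S² can be seen in the spectral density commonly used with the model-free approach, J(ω) = (2/5)[S²τm/(1 + ω²τm²) + (1 − S²)τ/(1 + ω²τ²)] with 1/τ = 1/τm + 1/τe. The text does not write this expression out, so its use here is an assumption, as are the function name and the example parameter values.

```python
import math


def model_free_j(omega, tau_m, tau_e, s2):
    """Model-free spectral density (omega in rad/s, times in s).

    s2 is the squared order parameter: 1 means the local structure is rigid
    with respect to overall tumbling, 0 means fully independent local motion.
    """
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)  # effective time for the fast term
    slow = s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    fast = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (slow + fast)


if __name__ == "__main__":
    omega_n = 2.0 * math.pi * 60.8e6   # assumed 15N Larmor frequency at 14.1 T
    tau_m, tau_e = 10e-9, 50e-12       # assumed overall and local correlation times
    for s2 in (1.0, 0.85, 0.5, 0.0):
        print(f"S2 = {s2:4.2f}: J(omega_N) = {model_free_j(omega_n, tau_m, tau_e, s2):.3e} s")
```

With τm much longer than τe, lowering S² moves spectral density from the slow overall-tumbling term to the fast local term, which is how flexible regions are distinguished from rigid ones in a fit.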

The frequencies of motions accessible to standard relaxation experiments are
quite high (Fig. 5.16), and although such motions often correlate with longer
timescale dynamics, they are much faster than are typically expected for functionally
important motions, such as those involved in signal transduction or enzyme activity.
For this reason, experiments have been developed to measure field-dependent T2
relaxation, which allows the detection of lower frequency components of spectral
densities. The standard pulse sequence for measurement of heteronuclear T2 is
called the Carr-Purcell-Meiboom-Gill experiment (CPMG) [82]. The power of the
RF pulses used to generate and invert transverse magnetization in the CPMG
experiment is related to how fast the net magnetization is rotated around the applied RF field,
and power levels for these pulse trains are usually described in terms of the
frequency with which the magnetization rotates around a transverse axis. For example,
a 200 Hz CPMG field rotates bulk magnetization around a transverse (that is, x or y)
axis 200 times per second, corresponding to a 90° pulse length of
(0.25 × 1/200 s⁻¹) = 1.25 ms. If an exchange process moves a spin between
magnetically nonequivalent sites on a timescale similar to that of the CPMG field rotational
frequency, the exchange process has a measurable effect on T2. The exchange rate
for that process, kex, can be extracted from the measured value of T2 as a function of
the CPMG field strength (dispersion curve). Resonances that are
unaffected by the exchange process exhibit T2 values that are invariant relative to the
CPMG field strength. Even in the absence of stringent quantitative interpretation,
this series of experiments can be used to identify resonances that are undergoing
exchange. In many cases, this method enables identification of the exchanging
conformers. Structural information about the conformers is obtained from the
chemical shift differences that are extracted from the dispersion curves.
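
Two pieces of arithmetic in this paragraph lend themselves to a short sketch. The first function reproduces the 90° pulse length quoted for a 200 Hz field. The second evaluates a dispersion curve using the fast-exchange (Luz-Meiboom) expression, which is a common approximation but is not the model named in the text; it is assumed here purely for illustration. The parameter `tau_cp` (the delay between successive 180° pulses), `phi_ex` (pA·pB·Δω²), and all numerical values are likewise assumptions.

```python
import math


def pulse_90_length(field_hz):
    """Length (s) of a 90 degree pulse for an RF field that rotates the
    magnetization field_hz times per second."""
    return 0.25 / field_hz


def r2_eff_fast_exchange(tau_cp, kex, phi_ex, r2_0):
    """Effective transverse relaxation rate (s^-1) in the fast-exchange limit.

    tau_cp : delay between successive 180 degree pulses (s)
    kex    : exchange rate constant (s^-1)
    phi_ex : pA * pB * (delta omega)^2, in (rad/s)^2
    r2_0   : relaxation rate in the absence of exchange (s^-1)
    """
    x = kex * tau_cp / 2.0
    return r2_0 + (phi_ex / kex) * (1.0 - math.tanh(x) / x)


if __name__ == "__main__":
    print(f"90 degree pulse at 200 Hz: {pulse_90_length(200.0) * 1e3:.2f} ms")
    # Assumed two-state system: 10 % minor state, 100 Hz shift difference.
    phi = 0.9 * 0.1 * (2.0 * math.pi * 100.0) ** 2
    for tau_cp in (10e-3, 2e-3, 1e-3, 0.5e-3, 0.1e-3):
        r2 = r2_eff_fast_exchange(tau_cp, kex=2000.0, phi_ex=phi, r2_0=10.0)
        print(f"tau_cp = {tau_cp * 1e3:5.2f} ms  R2,eff = {r2:6.2f} s^-1")
```

As the pulse spacing shortens (stronger effective field), the exchange contribution is refocused and R2,eff falls toward `r2_0`; a resonance with `phi_ex` equal to zero gives a flat curve, matching the description of resonances unaffected by exchange.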

A related measurement is that of T1ρ, the relaxation time of a resonance that is
spin-locked around the effective field Beff, the vector sum of the chemical shift offset
and the RF field B1, as in TOCSY and ROESY [83]. In the spin-locking frame of
reference, spins precess ("nutate") around Beff at a frequency ν determined by the
expression 2πν = ω = γBeff. As Beff is many orders of magnitude lower than the B0
field of the magnet in which the experiment is performed, nutation frequencies in
the spin-lock frame are also much lower than the Larmor frequency. As such, the
spectral density components needed for relaxation are provided by low frequency
motions, and the frequencies of these motions can be extracted from measurement
of T1ρ as a function of Beff.
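
The vector sum that defines Beff is easy to evaluate when both components are expressed in frequency units. The sketch below is illustrative only: the function name, the choice to work in Hz, and the example spin-lock strength and offsets are assumptions.

```python
import math


def effective_field(nu1_hz, offset_hz):
    """Return (nu_eff in Hz, tilt angle in degrees from the z axis) for a
    spin-lock field of strength nu1_hz applied at offset_hz from resonance."""
    nu_eff = math.hypot(nu1_hz, offset_hz)               # length of the vector sum
    theta = math.degrees(math.atan2(nu1_hz, offset_hz))  # 90 degrees on resonance
    return nu_eff, theta


if __name__ == "__main__":
    larmor_hz = 600.0e6  # assumed 1H Larmor frequency
    for offset in (0.0, 500.0, 2000.0):
        nu_eff, theta = effective_field(2000.0, offset)
        print(f"offset = {offset:6.0f} Hz: nu_eff = {nu_eff:7.1f} Hz, "
              f"tilt = {theta:5.1f} deg, nu_eff / Larmor = {nu_eff / larmor_hz:.1e}")
```

Even a few kHz of effective field is five to six orders of magnitude below the Larmor frequency, which is why T1ρ is sensitive to much slower motions than T1.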


5.5.6    Solid-State 2H Lineshape Analysis

Macromolecular motions on timescales between the ns and ms range are difficult
to probe by solution NMR. However, the nuclear quadrupole of 2H interacts with
electric field gradients with couplings on the order of 10⁵ Hz, and motions in this
range (μs timescale) will affect the observed 2H line shapes in solid-state NMR
spectra. Motional parameters can be varied to obtain the best fit between the predicted and
observed 2H line shapes in solid-state spectra, yielding information about local
motions on the μs timescale.

5.6  Instrumentation 

Macromolecular NMR spectroscopy is today almost exclusively performed using stable
superconducting magnets to provide the primary magnetic field. As noted in Sect. 5.1,
signal to noise ratio scales with field strength as B^(3/2), and spectral dispersion scales
linearly with field, so reaching higher fields has been a primary focus of the industry for
many years. Often, instruments available in-house at nonspecialized sites do not exceed
14.1 T (600 MHz 1H), and while much worthwhile biophysical research has been
accomplished using 500 and 600 MHz instruments, for larger macromolecules
(>20 kDa), there is significant benefit to moving to higher fields. Currently, the highest
field magnet in operation for spectroscopy is 23.5 T (1 GHz 1H), and 18.8 T (800 MHz
1H) and 21.1 T (900 MHz) magnets are not uncommon. Unfortunately, the cost of
high-field spectrometers is a significant drawback. There was a rule of thumb for many years
that NMR spectrometers cost "$1K per MHz." However, this rule
broke down long ago, and prices increase nonlinearly for fields above 600 MHz. Thus it
is often necessary to concentrate high-field NMR spectrometers in central locations,
with a staff of experts and engineers to help maintain the instrumentation and train/assist
outside users. This is true for both academic and industrial research institutions.
First-time users and nonexperts in NMR spectroscopy can usually find training and
assistance in acquiring and analyzing their data at such facilities, which can also provide
guidance about the feasibility and cost of a particular project. A quick on-line search
yields many possibilities for collaborative NMR research.
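
The field strengths and 1H frequencies quoted above are related through the 1H gyromagnetic ratio, and the B^(3/2) scaling gives a quick estimate of what a higher field buys. The following sketch uses γ/2π ≈ 42.577 MHz/T for 1H; the function names are hypothetical, and the estimate ignores probe, sample, and relaxation effects.

```python
GAMMA_BAR_1H_MHZ_PER_T = 42.577  # 1H gyromagnetic ratio divided by 2*pi


def larmor_mhz(b0_tesla):
    """1H Larmor frequency (MHz) at field b0_tesla."""
    return GAMMA_BAR_1H_MHZ_PER_T * b0_tesla


def relative_sensitivity(b_high, b_low):
    """Signal-to-noise gain expected from the B^(3/2) scaling alone."""
    return (b_high / b_low) ** 1.5


if __name__ == "__main__":
    for b0 in (14.1, 18.8, 21.1, 23.5):
        gain = relative_sensitivity(b0, 14.1)
        # Averaging time needed for a fixed signal to noise scales as 1/gain^2.
        print(f"{b0:5.1f} T -> {larmor_mhz(b0):7.1f} MHz, "
              f"S/N x{gain:.2f}, time saving x{gain ** 2:.1f} vs 14.1 T")
```

Because signal averaging improves signal to noise only as the square root of the acquisition time, a modest sensitivity gain translates into a much larger saving in instrument time.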


5.6.1    Solution-State NMR Probes

At least as important as the magnet is the probe used for a particular experiment.
The standard probe for biomolecular solution-state NMR as applied to proteins is
called a "triple-resonance" or "inverse detection" probe, and consists of an inner
coil (closest to the sample, and enclosing the sample volume as completely as
possible) for excitation and detection of 1H and 2H, and an outer coil tuned for 13C and
15N excitation and decoupling. 2H capability is primarily used for locking the
spectrometer frequencies relative to signals in the sample and for 2H decoupling. The
related "quadruple-resonance" probe includes 31P excitation/decoupling capability
for nucleic acid work. On many modern spectrometers, the transceiver coils and
preamplifier (which boosts the very weak detected signal prior to transmission to
the spectrometer console) are cooled to cryogenic temperatures (~32 K) by a flow
of cold helium gas. This results in a significant reduction in thermal noise within the
detection and preamplifier circuits and a corresponding improvement in signal to
noise (by a factor of 4 under ideal conditions). Cryogenically cooled probes are
particularly useful for dilute (<1 mM) samples of biological macromolecules, and
provide the greatest sensitivity improvement when used with low-salt buffers.

While direct detection of 13C, 31P, and 15N is possible using the outer coil on
inverse-detection probes, the signal to noise ratio suffers significantly due to the
lack of proximity and poorer filling of the coil-enclosed sensitive volume by the
sample. For large perdeuterated proteins and some paramagnetic species, it is often
desirable to directly observe 13C, because of its favorable relaxation characteristics
and, in many cases, simplified pulse sequences. For such experiments, inner coils
that are doubly tuned for 1H and 13C are available on specialized probes.

Almost  all  modern  NMR  probes  are  equipped  with  coils  for  applying  PFGs  in 
the  course  of  an  experiment.  The  coils  are  referred  to  by  the  axis  in  the  laboratory 
frame  along  which  the  gradient  is  applied  (x,  y,  and  z).  Single-axis  gradients  are 
usually  applied  along  z  (the  long  axis  of  the  sample),  and  are  sufficient  for  most 
applications. 


5.6.2    Solid-State NMR Probes

The requirements for MAS and, in many cases, pulse-rotor synchronization for
SSNMR experiments mean that these probes are very different in design from
solution-state probes. Power requirements for cross-polarization, high-power 1H
decoupling, sample alignment at the magic angle and high rotation speeds demand
rugged construction and more mechanical parts than for solution work. Until
relatively recently, the magnets used for SSNMR required wider bores than were
typically used for solution work, so the same magnets would not normally be used for
both. However, it is now possible to purchase SSNMR probes that fit into a
standard-bore magnet, so the same magnet can be used for both solution and SSNMR.


5.6.3    Electronics

Typically, a separate amplifier/transmitter (termed a "channel") is required for each
nucleus involved in an NMR experiment. For the detected nucleus and lock nucleus
channels, receivers and mixer circuits, as well as analog-to-digital converters
(ADCs), are also required. Each channel must be coordinated with the others using
timing circuits associated with a digital frequency synthesizer to ensure phase
coherence and appropriate timing of each pulse in the NMR experiment. Dedicated 1H
amplifiers for solution NMR are typically lower power (50 W) than those used for
other nuclei, as the spectral range covered by 1H is relatively small. Broadband
amplifiers used for other nuclei with wider chemical shift ranges are usually at least
200 W, and a separate amplifier is usually used for each channel. Amplifiers used
for 2H lock and decoupling are low power due to the narrow spectral range of 2H.
Amplifiers used for SSNMR are typically more powerful than those required for
solution NMR, because power requirements for 1H decoupling in CP-MAS are
determined not by the chemical shift range of 1H, but by the magnitude of dipolar
couplings, which can be in the 10s of kHz.


5.7    Experimental  Requirements 

5.7.1    Sample Requirements for Solution NMR

The primary requirement for solution NMR characterization of a biomolecule is,
obviously, solubility. However, the definition of "soluble" has changed dramatically
since the inception of biomolecular NMR, and samples with concentrations as low
as 100 μM are often adequate for structural determinations and dynamics
measurements with appropriate instrumentation. "Solubility with assistance," e.g., by
introduction of lipids, detergents/surfactants or stabilizing agents such as glycerol and
high salt, is often required for membrane-bound and membrane-associated proteins,
or proteins that aggregate or oligomerize [68, 70, 84]. One intriguing alternative for
solubilization of membrane proteins is the "nanodisc," a lipid bilayer discoid with
dimensions defined by the number of turns of α-lipoprotein used to generate it [85].
Nanodiscs can be prepared with a single membrane-bound protein associated with
each discoid, reducing the problems of self-association and heterogeneity often
observed with standard membrane protein solubilization techniques. Still,
whichever method is used, assisted solubilization often results in complications such as
increased sample viscosity (which broadens lines), large residual signals due to
additives and buffers, and "lossy" samples that require longer RF pulses to achieve
maximum excitation and can result in sample heating. While many commonly used
surfactants, detergents, and buffer additives are available in deuterated form
(reducing otherwise intense resonances), not all are, and those that are not can give rise
to very large signals in important regions of the spectrum. However, the use of deuterated
additives can complicate sample locking. The 2H lock signal (usually the HDO line from
~5% added D2O in aqueous samples) is used to adjust the spectrometer frequency
in response to small magnetic field fluctuations in the course of a long acquisition.
The presence of other strong 2H signals from additives can render the lock less
stable or allow the lock to "jump" from one signal to another during the experiment.

Another critical requirement is sample stability. As will be seen, a comprehensive
set of data acquired for complete characterization of a biomacromolecule can take
up to 2 weeks on the spectrometer. One of the most common causes of
decomposition is the presence of trace amounts of proteases (for proteins) and nucleases (for
nucleic acids). The addition of small amounts of appropriate inhibitors can help in
these cases. Sterile handling techniques, deoxygenation and the addition of a small
amount of sodium azide to NMR buffers will inhibit the growth of microorganisms
in NMR samples.


5.7.2    Isotope Labeling

The most common sample modification required for NMR is the introduction of
isotope labels. It is no accident that biomolecular NMR came into its own as an
independent structural methodology as methods for protein over-expression in
bacterial hosts became available, allowing both selective and uniform isotope labeling
to be performed. Although many types of samples (peptides, small proteins, and
oligonucleotides) are still usefully characterized by homonuclear (1H) correlation
experiments such as COSY, DQF-COSY, NOESY, and TOCSY, for larger proteins,
biomolecular NMR is best performed using 15N, 13C labeled samples. This labeling
scheme allows sequential assignments to proceed via standard three-dimensional
heteronuclear correlation experiments without resorting to sequential NOEs, which
can be ambiguous, especially in crowded spectra. Virtually any recombinant protein
or synthetic polynucleotide can be isotopically labeled, although protein yields
from in vivo expression systems (usually bacterial) often decrease significantly
when using the defined minimal media commonly used for expression of isotopically
labeled samples. Alternatively, good yields of isotope-labeled proteins have been
obtained using labeled algal extracts as bacterial growth media, expression in yeast
[86] or insect cells [87], or in vitro expression using wheat-germ extracts for
translational machinery [88, 89]. Eukaryotic proteins are sometimes not expressed well
in bacterial expression systems, particularly if disulfide bond formation or
glycosylation is required for correct folding, but such proteins have been successfully expressed in
eukaryotic (yeast, insect, and mammalian) expression systems [90]. Obtaining a
usable amount of labeled sample is sometimes the rate-limiting step in the NMR
characterization of a macromolecule, so it is usually worth testing a number of
expression systems and methods in unlabeled media before committing to the
expense of preparing a labeled sample.

The best labeling scheme depends upon the experiments to be performed and,
generally, the size of the molecule under investigation. For sequential assignments
and structural determination of proteins less than 25 kDa in molecular mass,
uniform 13C and 15N labeling is generally sufficient for all of the needed experiments
(see below). Above 25 kDa, the situation can become more complicated. Slower
tumbling rates make 1H-mediated 13C T2 relaxation more efficient, and 13C
coherences become shorter-lived, to the point where the standard 3D triple-resonance
experiments begin to fail. In such cases, fractional or complete deuteration of the
macromolecule is an option. Replacing protons bonded to 13C with 2H greatly
increases the lifetimes of 13C-based coherences, and makes it possible for standard
sequential assignment experiments to be performed with much larger macromolecules.
We routinely make sequential assignments on molecules >46 kDa using samples
perdeuterated to suppress 1H-mediated T2 relaxation [91], and assignments for
malate synthase G (81.4 kDa monomer) have been reported [64]. Growth media
for expression of perdeuterated and 13C,15N-labeled samples are prepared from
commercially available 13C, 2H-labeled carbon sources such as glycerol and
glucose, with H2O replaced in the medium by D2O. Care must be taken to ensure that other
nonobvious sources of 1H (e.g., hydrated salts added to the growth medium and
additives such as vitamins, antibiotics, and inducers) are identified and suppressed.
If care is taken in protecting spent D2O growth media from moisture after cell
harvest, it can be recycled by distillation from a strong aprotic base (MgO or CaO) and
filtration of the distillate through activated charcoal to remove volatile amines and
thiols. Densitometry provides a rapid assay for determining the % D2O present in
the recycled material.

The absence of 1H spins in perdeuterated samples has drawbacks. Most current
sequential assignment experimental schemes require 1H-15N pairs for detection, so
the samples as isolated are not suitable for triple-resonance experiments. While
purification of the protein in protonated buffers will exchange a large fraction of the
amide 2H for protons, stable folded proteins usually contain a significant number of
amide groups that are protected from exchange, and an unfolding/refolding step can
be included (if feasible) to get complete backbone amide exchange. Also, any
experiment that requires transfer of coherence between 1H and 13C (e.g., HCCH-TOCSY
and 13C-edited NOESY) cannot be performed with such samples, which precludes
side chain 1H resonance assignments. To recover some of this information,
selectively labeled samples can be prepared. In the simplest method, the growth medium
is enriched in one particular amino acid with 1H and 13C labels, so that only these
side chains show up in 13C-edited spectra. Scrambling of label can sometimes be
problematic, particularly between amino acids that are part of the same biosynthetic
pathway. However, scrambling usually follows a predictable pattern, and so can be
accounted for in data analysis. Schemes for the introduction of stereospecific 1H,13C
methyl labels at valine, leucine, isoleucine, and alanine have also been described
[92, 93]. These are particularly useful for structure determinations in that
nonsequential NOEs can be observed and RDCs measured for specific methyl groups,
providing structural restraints for large proteins.


5.7.3    Type-Selective, Residue-Selective, and Segmental Labeling

It is often useful to introduce labels selectively into a protein, both as an aid to
sequential assignment for larger proteins and to characterize a particular structural
or dynamic feature. Type-selective labeling, that is, labeling one type of amino acid
residue, is usually straightforward. For amino acids that are not intermediates in the
synthesis of multiple other amino acids, introduction of the labeled amino acids into
a defined growth medium at the point of expression is usually sufficient for good
selective labeling with a minimum of scrambling. Glycine, alanine, isoleucine,
leucine, phenylalanine, proline, lysine, valine, and serine are usually good candidates
for this type of labeling at either 15N or 13C [94]. Glutamate and glutamine are
intermediates in the transaminase cycle, and so are not good candidates for 15N selective
labeling. Cysteine is often back-scrambled to serine, and in general, an auxotrophic
bacterial strain is best used for cysteine labeling.

Residue-selective labeling is generally much more laborious, and usually
requires the use of pre-charged tRNAs that recognize the amber codon, introduced
at the appropriate point in the DNA sequence [95]. Alternatively, double-labeling
schemes can be used whereby residues can be assigned sequence-specifically via
the occurrence of a unique or rare connection (e.g., V-A or G-S) which gives rise to
an observable 15N-13C correlation. This scheme has been used for identifying
residues in the vicinity of paramagnetic centers that are otherwise undetectable due to
broadening of 1H signals [96-98]. Segmental labeling can be accomplished through
the use of semi-synthetic protein synthesis (inteins) [99], in which two or more
protein domains are expressed but only one is labeled. The domains are then spliced
together to generate a complete protein in which only one domain is labeled.


5.7.4    SSNMR Sample Requirements

SSNMR sample preparation for MAS-based experiments is in some ways less
restrictive than for solution NMR. Because cross-polarization between 1H and
heteronuclear spins is an important element in most SSNMR experiments, such
samples are generally protonated regardless of molecular mass. The extent of 15N and
13C labeling depends upon the type of experiment, but a variety of modern SSNMR
multidimensional pulse sequences are available for sequential assignments with
uniformly labeled samples, so this is a reasonable place to begin. Obviously,
solubility is not an issue for SSNMR sample preparation. However, sample
heterogeneity is a concern: if multiple static or slowly interconverting conformers of a
macromolecule are present in a sample, lines will be broadened and spectra become
more difficult to interpret. Typically, the best quality heteronuclear correlation
SSNMR data sets are obtained with homogeneous microcrystalline samples [100].
This ensures a minimum of sample heterogeneity at both the molecular and
mesoscopic levels.

Static  SSNMR  on  oriented  samples  is  often  used  to  characterize  orientation  of 
helices  bound  to  membranes  (PISEMA,  [101,  102]).  For  such  experiments,  samples 
are  mechanically  oriented  in  thin  layers  between  microscope  cover  slips  and  placed 
within  the  transceiver  coil.  A  specialized  probe  (without  sample  spinning)  is 
required  for  these  types  of  experiments. 


5. 7. 5    Solvent  Suppression 

Efficient solvent signal suppression has always been a primary concern of biomolecular
NMR. Almost all solution biomolecular NMR experiments are performed in
aqueous media, which are approximately 55 M in H2O (and thus 110 M in aqueous 1H).
A small fraction (<10 %) of the buffer is typically D2O, which is used for the spectrometer
lock. Nevertheless, this still results in a very large dynamic range problem if
the protons that need to be observed are present at ~0.001 M concentration.
Conversion of the analog RF signal detected in the NMR receiver coil to the digital
number required for signal processing is accomplished by an analog-to-digital
converter (ADC). ADCs are defined by their "bits," that is, the number of binary places
available to digitize the signal. An 18-bit ADC can generate a number as large as 2^17 (one
bit being reserved for sign), with a designated voltage change (ΔV) required to flip
a bit from 0 to 1. Thus the smallest signal that can be digitized results in a voltage
change in the receiver circuit of ΔV, and the maximum is ΔV × 2^17. Any signal
inducing a voltage greater than this will be "clipped" and will not be correctly digitized.
Typically, the unattenuated signal from H2O is sufficient to completely fill the
ADC, and the weak signals due to the macromolecule are lost or distorted.
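
The size of the dynamic range problem can be estimated from the numbers quoted above. The short Python sketch below is only an illustration: it assumes that signal amplitude scales linearly with proton concentration and that the receiver gain is set so that the water signal just fills the ADC, both of which are idealizations.

```python
import math

def bits_needed(ratio: float) -> float:
    """Number of binary places needed to span a given signal ratio."""
    return math.log2(ratio)

water_protons_molar = 110.0     # ~55 M H2O, two protons each (from the text)
solute_protons_molar = 0.001    # observed-proton concentration (from the text)
adc_bits = 18                   # example ADC from the text
magnitude_bits = adc_bits - 1   # one bit reserved for sign

ratio = water_protons_molar / solute_protons_molar
max_count = 2 ** magnitude_bits

# Assumption: water just fills the ADC and amplitude is proportional to
# concentration, so the solute signal spans only this many digitizer steps.
solute_counts = max_count / ratio

print(f"water/solute proton ratio: {ratio:.0f}")
print(f"bits needed to span that ratio: {bits_needed(ratio):.2f}")
print(f"solute amplitude in ADC steps: {solute_counts:.2f}")
```

Under these assumptions the water signal alone consumes nearly all 17 magnitude bits, leaving the solute signal only about one digitizer step, which is why the solvent signal must be attenuated before digitization.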

Simply replacing H2O with D2O is usually undesirable except in special cases,
since the NH correlations that are required for many multidimensional NMR experiments
are lost due to exchange with the deuterated buffer. Instead, most early solvent
suppression was accomplished by pre-saturation, that is, continuous low-power
irradiation of the water signal that equalizes the populations of the water proton spin
states and reduces the water signal to a minimum. However, there are many important
resonances near the water line in 1H spectra that are also lost by pre-saturation,
and "saturation transfer" via chemical exchange and spin diffusion leads to the
weakening of other signals that are not near the water line [103].

An alternative, and now almost universally adopted, scheme for water suppression
is the use of selective excitation pulses that are constructed to avoid
exciting the water signal in the first place, or else to return it to the +z axis before
detection. The WATERGATE (water suppression by gradient tailored excitation)
sequence [104] accomplishes this quite nicely, and is commonly found as part of the
final pulse train on 1H before detection in many multidimensional NMR experiments.
Another issue that often arises in the course of a multi-pulse NMR experiment is that of
radiation damping, which can occur when the H2O spin Boltzmann population is
inverted via a nonselective pulse. If the population is allowed to remain inverted, a
self-induced coherence analogous to that observed in lasers and masers can occur, generating a
strong signal in the receiver coil and resulting in spectral distortion and receiver overload.
To avoid radiation damping, multi-pulse experiments often include "flip-back"
pulses, which are selective for the water signal and return it to the +z axis [105,
106]. Flip-back pulses have the added benefit of reducing attenuation of exchangeable
1H signals due to saturation transfer from water. Finally, PFG selection of heteronuclear
correlation pathways, found in most modern multinuclear NMR
experiments, effectively reduces the water signal of uniformly labeled samples,
even without other forms of solvent suppression, since the water signal does not
involve any correlation with 15N or 13C.


5.7.6 Standard Experiments for Solution State 13C,15N-Labeled
Samples (Including TROSY Modifications)

As a rule, the following set of 3D experiments is comprehensive, and provides all
of the data needed for sequential assignments, chemical shift-based dihedral angle
restraints, and NOE-based distance restraints for proteins: HNCA, HN(CO)CA, HNCO,
HNCACB, CBCA(CO)NH, HCCH-TOCSY, 15N-edited TOCSY/NOESY, and
13C-edited NOESY [21, 23, 33, 107]. Because of the large chemical shift range of
13C, the 13C-edited NOESY and HCCH-TOCSY experiments can be run twice, in
one case with the 13C frequency centered on the aliphatic region (~10 to 70 ppm)
and in the other centered on the aromatic region (~70 to 150 ppm). The appropriate 2D
experiments (usually 15N and 13C HSQC) are generally interspersed
with the 3D runs to monitor changes in the sample and to provide convenient references
for interpreting the 3D datasets. Most of these experiments are available and
ready to run on standard commercial spectrometers.

There are a number of options that need to be considered when choosing which
variant of a given experiment is best. For proteins with molecular weights
<25 kDa, the default setup is usually sufficient, at least for preliminary experiments.
However, for larger proteins, the TROSY option is usually preferable [66]. As
described earlier, TROSY detection takes advantage of the fact that the four lines of
a 1H, 15N multiplet (split by 1JNH in both dimensions of a 1H, 15N HSQC experiment
obtained without decoupling of 15N during either t1 or t2) are differentially broadened.
The least efficiently relaxed (and hence narrowest and sharpest) line of the
multiplet can be selected with the appropriate phase cycling. While the differential
line broadening increases with field strength, we have found significant improvement
in resolution for a 46 kDa enzyme even at 600 MHz using TROSY-based
detection. For very large molecules, the CRINEPT experiment takes advantage of
the almost complete loss of signal intensity for the non-TROSY peaks via passive
detection of the narrowest line, without phase cycling. However, CRINEPT-based
detection is not commonly used in 3D experiments, since the experiment works best
for very large molecules (>100 kDa), and coherence transfer is difficult to maintain
through multiple transfer steps [67].


5.7.7 Solution State Experiments with Deuterated Proteins,
Including Direct 13C Observation

If a protein is perdeuterated (as well as 13C and 15N labeled), the molecular mass
range of standard sequential assignment experiments (HNCA, HN(CO)CA,
HNCACB, CBCA(CO)NH, 15N-edited NOESY) is considerably extended.
Backbone assignments for a number of proteins ~50 kDa in mass have been reported,
and oligomeric proteins >100 kDa with internal symmetry have been assigned. In
such cases, TROSY selection is appropriate, and the presence of 2H bound to 13C
requires that 2H decoupling be applied during 13C evolution. Because there are no
13C-bound protons in such samples, experiments such as HCCH-TOCSY and
13C-edited NOESY are not options. However, the advent of cryogenically cooled
probes capable of directly detecting 13C has made possible another class of experiments
for assignment of backbone and side chain resonances [108]. These include
experiments such as CON, COCA, and CAN [109-111] for sequential backbone
assignments of 13C and 15N, as well as CC-COSY and CC-TOCSY, which can be used
to extend assignments to side chains [112]. The lack of 1Hs also makes it possible to
detect 13C, 13C NOEs [113], which can replace 1H, 1H NOEs as structural restraints.


5.7.8 Sequential Assignment and Structural Determination
Experiments for Nucleic Acids

Because nucleic acids have a smaller set of monomers (ACTG or ACUG) and their
tertiary structures are less varied than those of proteins, degeneracy is a more serious
problem in the sequential assignment of nucleic acid polymers. Also, unlike proteins, in
which 1- and 2-bond 15N-13C couplings can be used to move from one residue to the
next in the assignment process, nucleic acid polymers are linked by phosphodiesters,
which require the use of relatively weak 2-bond (31P-13C) and 3-bond 1H-31P
and 13C-31P couplings for coherence transfer between monomers [114]. This, combined
with the relatively poor dispersion of 31P signals in polynucleotides, makes
the assignment process generally more dependent upon 13C- and 15N-edited
NOESY experiments [30, 115]. On the other hand, because nucleic acids are
straightforward to synthesize in an automated fashion, it is easy to include
sequence-specific labels. For structural determinations of multidomain RNA molecules,
assignments and structures are determined for single domains, and then the domains
are combined for characterization of the complete structure [28].


5.7.9 Residual Dipolar Coupling and Diffusion Measurements

RDC-based restraints are now a standard part of NMR-based structural determinations.
In the simplest cases, NH RDCs, which are the most straightforward to
obtain and analyze, can be measured by comparison of 2D spectra (often an HSQC)
obtained in isotropic and aligned media without decoupling of 15N during acquisition
of the 1H signal (or alternatively, without refocusing of NH coupling during 15N
evolution). In the isotropic medium, each NH correlation is split in the appropriate
dimension by 1JNH (usually ~92-94 Hz). Upon alignment, the observed splitting is
modulated by the RDC, which is then obtained by comparison with the splitting of
the same correlation in the unaligned spectrum. Of course, because this approach
doubles the number of peaks in the normal HSQC spectrum, and line broadening
usually increases in the presence of the aligning medium, it becomes
unwieldy for larger proteins. In that case, the use of a combination of TROSY/semi-TROSY
or IPAP (in-phase anti-phase) spectra yields a set of HSQC spectra that are
offset from each other by the coupling constants of the correlated spins from each
NH group. This reduces the complexity of the spectra to that of a typical decoupled
HSQC or TROSY spectrum, reducing overlap and allowing for easier peak picking
and analysis. RDC measurements are by no means limited to NH pairs. IPAP experiments
have been described to measure a wide variety of RDCs, including Cα-C,
N-C, and N-Cα for proteins and imino NH and aromatic CH RDCs for nucleic
acids [116, 117]. RDCs are particularly useful for nucleic acid structural work due
to the lack of large numbers of long-range restraints.
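
The comparison of splittings described above amounts to a per-residue subtraction. The following Python sketch illustrates the bookkeeping only; the residue labels and splitting values are hypothetical, and splittings are treated as positive magnitudes, which ignores the sign convention of the underlying coupling.

```python
def rdc_from_splittings(aligned_hz, isotropic_hz):
    """Return per-residue RDCs as (aligned splitting - isotropic splitting)."""
    rdcs = {}
    for residue, iso in isotropic_hz.items():
        if residue in aligned_hz:   # skip peaks lost to broadening or overlap
            rdcs[residue] = aligned_hz[residue] - iso
    return rdcs

# Hypothetical splittings in Hz (illustrative numbers, not measured data)
isotropic = {"G12": 93.1, "A13": 92.6, "L14": 93.4}
aligned = {"G12": 101.3, "A13": 85.0}   # L14 not resolved in the aligned medium

rdcs = rdc_from_splittings(aligned, isotropic)
for residue, value in sorted(rdcs.items()):
    print(f"{residue}: {value:+.1f} Hz")
```

Residues whose peaks cannot be picked in the aligned spectrum simply drop out of the restraint list, which is one reason the IPAP and TROSY-based variants are preferred for crowded spectra.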

Very early on, it was observed by Hahn that it should be possible to measure diffusion
by NMR via the application of field gradients, in the same classic paper in
which spin-echo and stimulated echo experiments are described [118]. However, it
was some years before a practical realization of this experiment was described, using
a time-dependent gradient [119], and the commercial availability of PFGs made the
experiment practical on most spectrometers [75]. For macromolecules, a limitation
of NMR-based diffusion measurement is that as the diffusing species increases in
size and diffuses more slowly, the gradient strength required for accurate measurement
also increases. For most commercial spectrometers, safe maximum gradient
strengths are often too weak for accurate diffusion measurements on macromolecules.
However, an interesting application of diffusion-based measurements is to
identify fragments that bind tightly to the target macromolecule of choice by
observing relative diffusion rates of the fragments [120]. Combined with other
methods, such as saturation transfer surface mapping [121], "pharmacophores"
(small molecules that can be combined into pharmacologically active compounds)
can be screened to aid in drug design.


5.8    Data  Analysis 

5.8.1 Data Processing and Assignment Software

NMR data analysis software can be classified roughly into three types: raw data
processing (that is, applying digital post-processing such as linear prediction, window
functions for optimizing data analysis, Fourier transformation of the data and
phasing), resonance assignment (including automated and semi-automated assignment
tools) and structural/dynamic modeling from the assigned NMR data. Some
packages encompass more than one of these functions, but usually at the expense of
compromising some functionality. Our discussion of such programs is necessarily
incomplete, as much of the software is freeware (or close to freeware) and new programs
are introduced while old ones are abandoned fairly often. Still, most NMR
spectroscopists find a package that they are comfortable with, and stick with it, so
there are often user groups and wikis associated with the various packages, even
obsolete ones.
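
To make the raw data processing steps concrete, the sketch below applies an exponential window function, zero filling and a Fourier transform to a synthetic free induction decay (FID). It is a minimal pure-Python illustration, not the algorithm of any particular package: the function names and all parameter values are hypothetical, a naive discrete Fourier transform stands in for the fast transform used in practice, and linear prediction and phasing are omitted.

```python
import cmath
import math

def synthetic_fid(n_points, dwell_s, freq_hz, t2_s):
    """Single decaying complex exponential, a stand-in for a real FID."""
    return [cmath.exp(2j * math.pi * freq_hz * k * dwell_s)
            * math.exp(-k * dwell_s / t2_s)
            for k in range(n_points)]

def apodize_exponential(fid, dwell_s, line_broadening_hz):
    """Multiply by exp(-pi * LB * t), the exponential window function."""
    return [s * math.exp(-math.pi * line_broadening_hz * k * dwell_s)
            for k, s in enumerate(fid)]

def zero_fill(fid, total_points):
    """Pad the FID with zeros to improve digital resolution."""
    return list(fid) + [0j] * (total_points - len(fid))

def dft(signal):
    """Naive O(N^2) discrete Fourier transform; adequate for a small demo."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * math.pi * j * k / n)
                for k, s in enumerate(signal))
            for j in range(n)]

dwell = 0.001   # 1 ms dwell time, i.e. a 1000 Hz spectral width
fid = synthetic_fid(128, dwell, freq_hz=125.0, t2_s=0.05)
fid = apodize_exponential(fid, dwell, line_broadening_hz=2.0)
fid = zero_fill(fid, 256)
spectrum = dft(fid)

peak_index = max(range(len(spectrum)), key=lambda j: abs(spectrum[j]))
peak_hz = peak_index / (len(spectrum) * dwell)
print(f"peak at {peak_hz:.1f} Hz")
```

The order of operations (window, zero fill, transform) mirrors the sequence typically scripted in processing packages; the window trades resolution for signal-to-noise, which is why its parameters are usually tuned interactively.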

All spectrometer vendors (Bruker and Varian/Agilent being the most commonly
used for biophysical applications at the time of this writing) provide software with
the instrument that controls data acquisition and provides reasonable processing
capability. However, for users with multiple NMR platforms, it is often useful to
process raw NMR data using a single program that can interpret data formats from
different instruments and provide a common output format. NMRPipe [122], originally
provided by the NMR groups at the National Institutes of Health, is capable of
advanced processing that produces spectra that in turn can be read by a variety of
analysis software. NMRPipe has a relatively simple graphical user interface (GUI),
NMRDraw, that allows interaction to establish optimum processing parameters.
NMRPipe has the advantage of being free to academic users, and is easily installed
on UNIX and Linux operating systems. A more extensive GUI is provided by Felix
[123], a distant descendant of one of the first stand-alone NMR processing programs,
Dennis Hare's FTNMR [124]. Felix incorporates peak-picking algorithms as
well as interactive processing tools and multidimensional spectrum visualization.

The next step in analysis, sequential resonance assignment and tabulation of
structural and dynamic restraints, can be accomplished using a variety of programs,
again each with strengths and weaknesses. The choice of program is a matter of
personal preference. An early data analysis program, XEASY, was developed in the
Wüthrich laboratory at the ETH in Zurich, and provided considerable flexibility in
viewing and analyzing multidimensional NMR data [125]. The XEASY suite has
since been replaced by CARA, which has more flexible project management capability,
runs on a wider range of platforms, and is still supported [126]. Another
commonly used freeware analysis program is SPARKY [127], developed at the
University of California San Francisco. Originally developed for analyzing nucleic
acid NMR data, SPARKY has been adapted reasonably well for the analysis of
protein NMR data, but strip plot functionality (used for analysis of 3D NMR data)
is somewhat limited. If one is willing to pay for analysis software, there are a number
of useful packages available. The popular NMRView program, originally developed
as freeware, is now offered in supported form for academic users [128]. Felix
remains a powerful analysis program, and was recently spun off by Accelrys.

Over the years, a great deal of effort has been spent on development of automated
or semi-automated assignment methods for multidimensional NMR spectra [129,
130]. While there has been some success in this area [131], particularly with smaller
proteins, completely automated resonance assignment for larger proteins is still not
possible. A major part of the problem is the accurate determination of what in a
given spectrum constitutes a peak (as opposed to noise) and how the program
distinguishes peak identities in regions of significant spectral overlap. In many cases,
the human eye is still the best judge of (a) whether a peak is really a peak and (b)
where the maximum of the peak lies. Even for fairly well resolved spectra of
biomolecules, the results of automatic peak picking (available in all of the standard
processing and analysis packages) should always be at least spot-checked by eye for
accuracy.
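
The two judgments described above, whether a point is a peak rather than noise and where its maximum lies, can be illustrated with a deliberately naive one-dimensional peak picker. This Python sketch is an assumption-laden toy (a fixed noise region, a simple multiple-of-noise threshold, hypothetical data) and is not the method used by any of the packages mentioned.

```python
import statistics

def pick_peaks(intensities, noise_region, threshold_sigma=5.0):
    """Return indices of local maxima above threshold_sigma * noise s.d.

    noise_region is a (start, stop) range of points assumed to contain
    only baseline noise; choosing it is itself a judgment call.
    """
    start, stop = noise_region
    noise_sd = statistics.pstdev(intensities[start:stop])
    cutoff = threshold_sigma * noise_sd
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y > cutoff and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append(i)
    return peaks

# Hypothetical trace: baseline ripple, one strong peak with a shoulder,
# and a small bump that falls just below the noise cutoff
trace = [0.1, -0.2, 0.15, -0.1, 0.05, 0.3, 4.0, 9.0, 4.5, 5.0, 2.0,
         0.2, -0.1, 0.5, 0.1, -0.15, 0.1]
print(pick_peaks(trace, noise_region=(0, 5)))
```

In this example the shoulder at index 9 is reported as a separate peak and the small bump at index 13 is rejected; whether either decision is correct depends on information the algorithm does not have, which is the reason for checking automatic peak lists by eye.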

Once peaks have been picked, the resulting peak lists can be incorporated into
automated assignment routines. Again, a wide variety of automated assignment programs
are available, each with a defined minimum set of experiments that are
required. A thorough review of these programs is beyond the scope of this chapter;
the field is changing fairly rapidly with ongoing improvements in the extent to which
assignments can be made by automated methods and increased flexibility of required
input data. Given that most of the programs are based on pattern recognition for
particular residue types, the availability of accurate and relatively complete peak
lists is critical in determining the degree to which the programs are successful.

To save time in structural genomics efforts, it might be desirable to avoid the
sequential assignment process altogether, and go directly to the structure determination
step [132]. (The author admits to serious philosophical arguments with this
approach, since one learns a lot about one's pet protein/nucleic acid in the process
of doing assignments!) Nevertheless, using sequence and chemical shift information,
supplemented by NOEs and/or RDC data, it has been demonstrated with simple
systems that it is possible to generate families of structures based on structural
homology and statistically likely fits of the experimental data [133].


5.8.2    Structural  and  Dynamic  Analysis  of  NMR  Data 

Once sequential assignments have been obtained, the extraction of useful structural
and dynamic information from acquired data can begin in earnest. The first NMR
structural determinations were carried out using distance geometry, using NOEs to
represent distances between pairs of 1Hs, which were used to determine unknown
distances by determining the geometric relationships between different pairs with
known and unknown distances [134]. While distance geometry calculations are
available as part of many computational packages, and often provide a useful front
end for structural refinement, the vast majority of NMR-based structural refinements
are now carried out using simulated annealing (SA), a form of restrained molecular
dynamics (MD). As in all MD simulations, kinetic energy is randomly distributed
(as a function of the simulation temperature) to individual atoms in the simulation.
This randomly assigned energy imparts a finite speed to the atom as the simulation
begins. All of the forces acting on the individual atoms are calculated and a velocity
assigned. After allowing motion for a very short time (~ps), the forces are recalculated
before the next step. Unlike standard MD simulations, SA is not intended to
reproduce actual motions of the molecule in question. Instead, the molecule is raised
(computationally) to a very high temperature, so that energetic barriers between the
starting structure and the "correct" structure can be overcome. As the simulation
proceeds, the temperature is gradually lowered and experimental restraints applied
more strongly. Restraints are defined in terms of penalty functions, that is, a violation
of the restraint raises the energy of the system. Although Cartesian dynamics
(that is, motion is calculated in three-dimensional space) is still commonly used,
torsional dynamics, in which rotation around appropriate bonds provides the degrees
of freedom for motion, is computationally more efficient and is often used for
calculating structures from NMR data.
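
The annealing idea (start hot, cool gradually, and strengthen the restraint penalty as the run proceeds) can be sketched on a toy problem. Note the substitution: the code below uses Metropolis Monte Carlo moves rather than the restrained molecular dynamics described above, and the three "atoms", their target distances, the cooling schedule and the weights are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical target distances (arbitrary units) between three "atoms"
RESTRAINTS = {(0, 1): 3.0, (1, 2): 4.0, (0, 2): 5.0}

def penalty(coords, weight=1.0):
    """Harmonic penalty: each violated distance raises the pseudo-energy."""
    total = 0.0
    for (i, j), target in RESTRAINTS.items():
        d = math.dist(coords[i], coords[j])
        total += (d - target) ** 2
    return weight * total

def anneal(n_steps=20000, t_start=10.0, t_end=1e-3, seed=1):
    rng = random.Random(seed)
    coords = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(3)]
    for step in range(n_steps):
        frac = step / (n_steps - 1)
        temp = t_start * (t_end / t_start) ** frac   # geometric cooling
        weight = 0.1 + 0.9 * frac                    # restraints ramped up
        sigma = 0.5 * math.sqrt(temp)                # smaller moves when cold
        atom = rng.randrange(3)
        old = coords[atom][:]
        e_old = penalty(coords, weight)
        coords[atom] = [old[0] + rng.gauss(0, sigma),
                        old[1] + rng.gauss(0, sigma)]
        delta = penalty(coords, weight) - e_old
        # Metropolis criterion: always accept downhill, sometimes uphill
        if delta > 0 and rng.random() >= math.exp(-delta / temp):
            coords[atom] = old                       # reject the move
    return coords

final = anneal()
print(f"residual penalty: {penalty(final):.4f}")
```

At high temperature almost any move is accepted, so the starting geometry is quickly forgotten; by the end of the run only moves that keep the distances close to their targets survive. Production structure calculations apply the same schedule to thousands of restraints together with a molecular force field.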

Multiple computational platforms are available for SA calculations. Some of the
oldest programs, including AMBER [135] and XPLOR (now XPLOR-NIH) [136],
are still actively supported, with new releases matching developments in the field.
Given that these programs were first written when GUIs were nonexistent, graphical
interfaces are for the most part add-ons. However, XPLOR-NIH can be interfaced
with the visualization software VMD [137], and AMBER supports a graphical molecule
building interface, Xleap. Python scripting is preferred for input in the newer
releases of XPLOR-NIH. A menu-driven descendant of XPLOR, CNS, is also in
common use [138]. Most molecular graphics and visualization software can be used
to examine the results of XPLOR and AMBER calculations, often by simply exporting
the results as a PDB (Protein Data Bank) format file. DYANA is another widely
used package, and combined with automated NOE assignment methods it forms the
basis of the CYANA program, which iteratively assigns NOEs and generates test
structures based on those assignments [139-141].

The most common restraint used in structure determination is the NOE-based
distance. Each identified NOE is converted to a distance, and if the two restrained
atoms deviate from that distance, an energetic penalty is applied. The NOE can be
calibrated fairly precisely by measuring the NOE as a function of the buildup time
(%NOE versus τmix), so these distances can be constrained quite tightly. However, it
is generally more useful to divide the NOEs into several classes (weak, medium and
strong), and choose an approximate distance for each class (e.g., strong <3 Å,
medium <4 Å and weak <5 Å). There are several reasons why this approximation is
appropriate. First, multiple relaxation pathways and spin diffusion can attenuate the
NOE between protons in 1H-rich regions relative to those in 1H-poor regions of the
macromolecule, making a single calibration difficult to use. Furthermore, if an NOE
is incorrectly assigned, the more stringent restraints magnify the introduced error.
Finally, if there is increased dynamics in one region of the macromolecule relative
to another, a stringent distance restraint may mask local conformational flexibility.
Generally, it is better to have many loosely constrained distances (many NOEs) as
opposed to a few, strongly restrained distances.
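
A loose, class-based restraint of this kind is often written as a penalty that is zero below the upper bound and rises once the bound is exceeded. The Python sketch below uses the distance classes quoted above; the intensity cutoffs used for binning and the quadratic form of the penalty are illustrative assumptions, not values taken from any particular program.

```python
UPPER_BOUNDS_ANGSTROM = {"strong": 3.0, "medium": 4.0, "weak": 5.0}  # from the text

def classify_noe(relative_intensity, strong_cut=0.5, medium_cut=0.2):
    """Bin a normalized NOE intensity (0 to 1); the cutoffs are hypothetical."""
    if relative_intensity >= strong_cut:
        return "strong"
    if relative_intensity >= medium_cut:
        return "medium"
    return "weak"

def restraint_penalty(distance, upper_bound, force_constant=1.0):
    """Zero inside the allowed range, quadratic once the bound is exceeded."""
    excess = max(0.0, distance - upper_bound)
    return force_constant * excess ** 2

noe_class = classify_noe(0.3)
bound = UPPER_BOUNDS_ANGSTROM[noe_class]
for d in (2.5, 3.9, 5.0):
    print(f"{noe_class} NOE, d = {d} A: penalty = {restraint_penalty(d, bound):.2f}")
```

Because any distance shorter than the bound costs nothing, a misjudged intensity or a locally flexible region perturbs the calculation far less than a tightly calibrated single distance would.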

Another important class of restraints is that on backbone and side chain dihedral
angles. Traditionally, these were obtained by measuring coupling constants of
spins across the angle of interest, and applying the appropriate Karplus relationship to
obtain an allowable range of dihedral angles [142]. For nucleic acid structure determinations,
the accurate measurement of 3JCP couplings across phosphodiester linkages
provides critical information regarding DNA and RNA backbone conformations [143].
However, in recent years, the availability of a large database of chemical shift assignments
for known protein structures has made it possible to predict dihedral angles
based solely on the chemical shifts of the atoms involved. One algorithm for extracting
dihedral angle information for polypeptides from chemical shifts is TALOS+,
which bases predictions on a sliding three-residue frame [144]. Generally, the more
shifts that are assigned (e.g., 1H, 15N, 13C) the more reliable the predictions. Some MD
packages (XPLOR-NIH, for example) can incorporate the shift data directly, without
need for preparing an explicit set of angular restraints. 1H chemical shift data can also
be used quantitatively for determining the relative orientation and distance between a
shifted 1H spin and nearby aromatic rings. Aromatic rings often induce fairly dramatic
1H chemical shift changes due to "ring currents," with de-shielding occurring in the
plane of the ring and shielding above and below the plane. The SHIFTS program can
incorporate this information as distance and angle restraints in AMBER [145].
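
The Karplus relationship has the general form 3J = A cos²θ + B cos θ + C, where θ is related to the dihedral angle by a fixed offset. The Python sketch below shows why a measured coupling gives an allowable range of angles rather than a single value. The coefficients and the 60° offset are one commonly quoted parameterization for the backbone HN-Hα coupling and should be treated as assumptions here; other couplings require different values.

```python
import math

# Assumed Karplus coefficients (Hz) for the backbone HN-HA coupling;
# one commonly quoted parameterization, not taken from this chapter.
A, B, C = 6.51, -1.76, 1.60

def karplus(phi_deg):
    """Predicted 3J (Hz) for backbone dihedral phi, with theta = phi - 60."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

def allowed_phi(j_obs_hz, tolerance_hz=0.5, step_deg=1):
    """All phi values (degrees) whose predicted J matches within tolerance."""
    return [phi for phi in range(-180, 180, step_deg)
            if abs(karplus(phi) - j_obs_hz) <= tolerance_hz]

for j_obs in (9.5, 4.1):
    angles = allowed_phi(j_obs)
    print(f"J = {j_obs} Hz: {len(angles)} candidate angles, "
          f"from {angles[0]} to {angles[-1]} degrees")
```

A large coupling is matched only near phi = -120°, whereas a small coupling is consistent with several distinct angular regions; this degeneracy is why coupling-derived dihedral restraints are normally entered as ranges.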

RDCs have in recent years become a critical component of NMR structure determination.
Unlike other NMR-based restraints, RDCs provide information regarding
the relative orientations of internuclear vectors with respect to a single frame of
reference provided by the alignment tensor. In a sense, RDCs provide a "whole-molecule"
perspective that is absent from other types of NMR-based restraints.
RDCs are of particular value for multi-domain and extended structures, where small
errors in local restraints might lead to much larger displacements remotely. The use
of RDCs in structural calculations requires as input not only the measured values of the
RDCs, but also the principal components of the alignment tensor. If a
preliminary or model structure is known, a starting tensor can be calculated using
singular value decomposition (SVD), in which the alignment tensor components are
considered as unknown but overdetermined. Alternatively, for a completely
unknown structure, the tensor elements can be calculated using histograms, by grid
search or by simplex fitting. In either case, once the tensor components are determined,
they are iteratively refined in the course of the structure calculations, ideally resulting
in a best-fit structure and tensor for the experimental RDCs. A number of programs
are available for accomplishing these tasks, including REDCAT [146] and
PALES [147]. If more than one alignment medium is used to measure RDCs, separate
alignment tensors are required for each data set.
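
When a model structure supplies the orientations of the internuclear vectors, fitting the tensor is a linear least-squares problem in five unknowns. The Python sketch below illustrates this with synthetic data. Several simplifications are assumed: the overdetermined system is solved through the normal equations with Gaussian elimination rather than by SVD, the dipolar prefactor is absorbed into the tensor elements (so they are in Hz), the result is the tensor in the molecular frame (diagonalization to principal components is omitted), and all vectors and tensor values are hypothetical.

```python
import math

def design_row(v):
    """Coefficients of (Syy, Szz, Sxy, Sxz, Syz) for unit vector v.

    Uses the traceless condition Sxx = -(Syy + Szz).
    """
    x, y, z = v
    return [y * y - x * x, z * z - x * x, 2 * x * y, 2 * x * z, 2 * y * z]

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_tensor(vectors, rdcs):
    """Least-squares tensor elements from unit vectors and measured RDCs."""
    rows = [design_row(v) for v in vectors]
    n = 5
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atd = [sum(r[i] * d for r, d in zip(rows, rdcs)) for i in range(n)]
    return solve_linear(ata, atd)

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

# Hypothetical bond-vector orientations from a model structure
vectors = [unit(v) for v in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0),
                             (1, 0, 1), (0, 1, 1), (1, 1, 1), (1, -1, 1)]]
true_tensor = [4.0, -9.0, 1.5, -2.0, 3.0]   # Syy, Szz, Sxy, Sxz, Syz in Hz
rdcs = [sum(c * s for c, s in zip(design_row(v), true_tensor)) for v in vectors]

fitted = fit_tensor(vectors, rdcs)
print([round(s, 6) for s in fitted])
```

With noise-free synthetic data the fit returns the input tensor; with experimental RDCs the residuals of such a fit indicate how well the model structure agrees with the measurements. At least five RDCs from vectors with sufficiently different orientations are needed for the system to be determined.
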


5.8.3 Application of NMR Spectroscopy to Fundamental
Questions of Biophysics

As the reader is by now aware, NMR spectroscopy is a powerful tool for characterizing
structural and dynamic features of macromolecules, with applicability to
questions of protein folding [148, 149], ligand [150], cofactor [77], effector [94]
and substrate binding [151-153], and characterization of modifications and mutations
[76]. This chapter is not intended to be a comprehensive survey of the applications
of NMR spectroscopy to biophysical research problems. Such a list would
include thousands of references and would still be incomplete. We hope that we
have provided the reader with the information needed to evaluate the usefulness of
NMR spectroscopy in their own research, and that they will seek out local expertise
for help in designing and performing the appropriate NMR experiments.


Chapter 6

Electron  Paramagnetic  Resonance 
Spectroscopy 

John  H.  Golbeck  and  Art  van  der  Est 


Abstract Electron paramagnetic resonance (EPR) spectroscopy is widely used to
study proteins that contain naturally occurring paramagnetic centers and/or artificially
introduced spin labels. In this chapter we present a mainly qualitative overview
of the application of EPR spectroscopy to the study of biological systems. The
chapter begins with a short description of the physical principles underlying the
method and the basic experimental techniques. An overview of characteristic lineshapes
observed under various experimental conditions is then presented to show
how quantities such as hyperfine couplings, g-anisotropy and zero-field splitting
manifest themselves in EPR data. A number of specific examples are used to illustrate
how these quantities can be used to obtain information about the geometry,
bonding, electronic structure, etc. of biological systems.

coupling • Hyperfine coupling • Organic radicals • Metalloproteins • Molecular triplet
states • Light-induced radical pairs


J.H. Golbeck (✉)

Department  of  Biochemistry  and  Molecular  Biology,  The  Pennsylvania  State  University, 
University  Park,  PA  16802,  USA 

Department  of  Chemistry,  The  Pennsylvania  State  University, 
University  Park,  PA  16802,  USA 
e-mail:  jhg5@psu.edu 

A.  van  der  Est 

Department  of  Chemistry,  Brock  University,  St.  Catharines,  ON,  Canada,  L2S  3A1 
e-mail:  avde@brocku.ca 


N.M.  Allewell  et  al.  (eds.),  Molecular  Biophysics  for  the  Life  Sciences, 
Biophysics for the Life Sciences 6, DOI 10.1007/978-1-4614-8548-3_6,
©  Springer  Science+Business  Media  New  York  2013 



6.1  Introduction 


Electron paramagnetic resonance (EPR) detects the magnetic susceptibility associated
with unpaired electrons in atoms or molecules. The presence of unpaired electrons is a
rather unusual condition, as most stable molecules have filled electron shells and the
Pauli Exclusion Principle specifies that their electrons be paired. Nevertheless, four
important exceptions occur in biological systems. First, the transition metals typically
have partially filled d-shells and many of them are paramagnetic in one or more of their
oxidation states. The first row transition elements V, Mn, Fe, Co, Ni, and Cu, the second
row transition element Mo, and the third row transition element W are found in
metalloproteins. The metastable intermediates found in photoactive proteins and radical
enzymes form a second class of paramagnetic species found in nature. These intermediates
include excited triplet states, carbon-, oxygen-, and sulfur-centered radicals, and
biradicals such as those formed in photosynthetic systems. Third, molecular oxygen, which
makes up ~20 % of the atmosphere, has a paramagnetic triplet ground state. Finally,
stable nitroxide radical-based spin labels, such as
S-(2,2,5,5-tetramethyl-2,5-dihydro-1H-pyrrol-3-yl)methyl methanesulfonothioate
(MTSL), can be incorporated into proteins and used as probes of the molecular
environment. Here, we give a brief, largely qualitative introduction to EPR spectroscopy
in which we use examples from three of these classes of systems to illustrate the
physical principles of the method and its application in biophysics. For more
in-depth treatments of EPR, readers are referred to the excellent introductory textbooks
by Weil, Bolton, and Wertz (Electron paramagnetic resonance, elementary
theory and practical applications, Wiley, New York) [1], Atherton (Principles of
electron spin resonance, Ellis Horwood/Prentice Hall, New York) [2], and Hagen
(Spectroscopy, CRC Press, Boca Raton, FL) [3], as well as overviews of practical
aspects of the technique by Brustolon and Giamello (Electron paramagnetic resonance:
a practitioner's toolkit, Wiley, Hoboken, NJ) [4] and advanced texts on pulsed
EPR by Schweiger and Jeschke (Principles of pulsed electron paramagnetic resonance,
Oxford University Press, Oxford) [5], on high field EPR by Möbius and Savitsky
(High field EPR spectroscopy on proteins and their model systems: characterization
of transient paramagnetic states, RSC, Cambridge) [6], and on spin labeling by
Berliner (Spin labeling, the next millennium, Plenum Press, New York) [7].


6.2    Physical  Principles 
6.2.1    Basic  Principle  of  EPR 

Magnetism is the result of the motion of charges. On a macroscopic level, a current
passing through a loop generates a magnetic field perpendicular to the plane of the
loop. On a microscopic scale, a circulating negatively charged particle generates a
magnetic moment, $\vec{\mu}$, that is antiparallel to the angular momentum vector
$\vec{l}$, as shown in Fig. 6.1a. Free electrons do not circulate but have an intrinsic
angular momentum, $\vec{S}$, referred to as spin, and the magnetic moment of a free
electron is antiparallel to its spin angular momentum. The two quantities are related by:

$\vec{\mu} = -g_e \beta_e \vec{S}$,  (6.1)

where $g_e$ is the free electron g-factor (2.0023193) and $\beta_e$ is the electron Bohr
magneton ($9.27401 \times 10^{-24}$ J T$^{-1}$). Bound electrons, on the other hand, have
both orbital and spin angular momentum and both of these contribute to the magnetic
moment. In light atoms and organic molecules, the orbital angular momentum is small and
hence the magnetic moment is usually discussed in terms of the spin angular
momentum. In the presence of a magnetic field, $\vec{B}$, a magnetic moment experiences
an interaction energy that depends on the relative orientation of $\vec{\mu}$ and
$\vec{B}$. Classically, the interaction energy is given by:

$E = -B_0 \mu_z$,  (6.2)

where $B_0$ is the magnitude of the magnetic field, which is defined to lie along the
z-direction.

Fig. 6.1 (a) Magnetic moment and angular momentum of a circulating negatively charged
particle, (b) Zeeman energy levels for a free electron

Quantum mechanically, the Hamiltonian operator for the energy is obtained by
replacing $\mu_z$ in (6.2) with the corresponding operator $\hat{\mu}_z$:

$\hat{H} = -B_0 \hat{\mu}_z = g_e \beta_e B_0 \hat{S}_z$,  (6.3)

where $\hat{S}_z$ is the operator for the z-component of the spin angular momentum.
Solving the Schrödinger equation for the energy gives:

$E = g_e \beta_e B_0 m_s$.  (6.4)

For a free electron, the quantum number $m_s$ for the z-component of the spin angular
momentum can take values of +1/2 and -1/2. Thus, in a magnetic field a free
electron has two energy states with $E = +\frac{1}{2} g_e \beta_e B_0$ and
$E = -\frac{1}{2} g_e \beta_e B_0$, as shown in Fig. 6.1b. The interaction between the
magnetic moment and the field is known as the Zeeman interaction, and the basic principle
of EPR spectroscopy is the measurement of the absorption of electromagnetic radiation as
a result of transitions between the Zeeman energy levels. For atoms and molecules, there
are additional interactions (hyperfine coupling, dipolar coupling, zero-field splitting,
etc.) that lead to extra terms in the Hamiltonian operator and cause shifts and/or
splittings of the energy levels. The effect of these terms on the EPR spectrum will be
described below when the characteristic lineshapes are described. From a biophysics point
of view, the interactions that lead to the additional terms are extremely important
because they depend on the local environment of the unpaired electrons, and hence
EPR spectra contain information about geometry, bonding, electronic structure,
etc., of the paramagnetic species. Examples of the information contained in EPR
data will be presented in Sect. 6.5 of this chapter. First, however, we give a brief
description of the EPR experiment and introduce the concept of magnetization, which
is important for understanding pulsed experiments and relaxation. This is followed
by a short discussion of several important experimental aspects of EPR.
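
As a rough numerical illustration of the Zeeman splitting, the following Python sketch evaluates the two energy levels of a free electron and the frequency that matches their separation. The function names and the example field of 0.33 T are illustrative choices rather than anything prescribed by the text, and the constants are rounded.

```python
# Zeeman levels of a free electron: E = m_s * g_e * beta_e * B0.
G_E = 2.0023193          # free-electron g-factor
BETA_E = 9.27401e-24     # Bohr magneton, J/T
H_PLANCK = 6.62607e-34   # Planck constant, J s


def zeeman_energy(m_s, b0):
    """Energy (J) of the spin state m_s in a static field b0 (T)."""
    return m_s * G_E * BETA_E * b0


def resonance_frequency(b0):
    """Frequency (Hz) matching the splitting between m_s = +1/2 and -1/2."""
    delta_e = zeeman_energy(+0.5, b0) - zeeman_energy(-0.5, b0)
    return delta_e / H_PLANCK


b0 = 0.33  # T, illustrative value
print(f"E(+1/2) = {zeeman_energy(+0.5, b0):.3e} J")
print(f"E(-1/2) = {zeeman_energy(-0.5, b0):.3e} J")
print(f"nu      = {resonance_frequency(b0) / 1e9:.2f} GHz")
```

The splitting grows linearly with the field, so doubling `b0` doubles the resonance frequency.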


6.2.2    The  EPR  Experiment 

Figure 6.2 shows a schematic diagram of the main components of a continuous
wave EPR spectrometer using lock-in detection. The microwave components are
contained in a microwave bridge, indicated by the dashed line in Fig. 6.2. The
microwave source is typically either a klystron or a Gunn diode, and the microwaves
exiting the source are split into a signal arm and a reference arm. The signal arm has an
attenuator to adjust the microwave power reaching the sample, and a circulator is
used to direct microwaves to the resonator, which is mounted in the magnet.
Reflected power from the resonator passes through the circulator to the detector. The
reference arm has a phase shifter that allows the relative phase of the microwaves in
the two arms to be varied. The microwave signals from the two arms are then mixed
and converted to a DC signal, or slowly oscillating signal, by the detector, which is
typically either a diode or a mixer. The signal from the detector is then amplified by
a preamplifier and fed to a lock-in amplifier. The lock-in amplifier modulates the
EPR signal by modulating the magnetic field using a set of modulation coils
mounted on the resonator, and it amplifies only those signals from the bridge that
oscillate at the modulation frequency. The output of the lock-in amplifier is recorded
as the magnetic field is swept through a region of interest to generate the EPR
spectrum.

Fig. 6.2 Schematic diagram of a basic continuous wave EPR spectrometer using field
modulation detection


6.2.3  Magnetization 

In a typical EPR experiment, the sample contains on the order of $10^{16}$ spins and it is
useful to describe the ensemble average behavior of the system. When a paramagnetic
substance is placed in a magnetic field, it becomes magnetized, and EPR experiments
can be understood in terms of the response of the magnetization to the
applied electromagnetic radiation. The magnetization of the sample is defined as the
ensemble average magnetic moment. At thermal equilibrium, the x and y components
of the magnetic moment average to zero and the magnetization is the average
magnetic moment along the magnetic field,

$M = \rho \bar{\mu} = \rho \bar{\mu}_z$,  (6.5)

where $\rho$ is the number of spins per unit volume. Using the Boltzmann distribution
to calculate $\bar{\mu}_z$ gives:

$M = \rho \bar{\mu}_z = \frac{\rho g_e^2 \beta_e^2 S(S+1) B_0}{3kT} = \frac{\chi_m B_0}{\mu_r \mu_0}$,  (6.6)
where $S$ is the spin quantum number, $\chi_m$ is the magnetic susceptibility, and $\mu_r$
and $\mu_0$ are the relative and free space permeabilities, respectively. Equation (6.6)
shows that the magnetic susceptibility depends on the inverse temperature, which is known
as the Curie law. In an EPR experiment the response of the magnetization (or magnetic
susceptibility) to the oscillating field of the electromagnetic radiation is measured.
Experimentally, the microwaves are applied to the sample such that the magnetic
field component oscillates in a plane perpendicular to the static field. The response
of the system is measured by phase-sensitive detection of the microwaves in this
plane. The in-phase response is defined as $\chi'$ and the response 90° out of phase is
$\chi''$. As shown in Fig. 6.2, the microwave signal from the sample is mixed with a
reference microwave beam to produce a DC (or slowly oscillating) signal. Mathematically,
this is equivalent to observing the x or y component of the magnetization in a frame
of reference rotating at the frequency of the reference beam. Vector diagrams of the
behavior of the magnetization in the rotating frame, such as those shown in Fig. 6.3, are
useful for understanding the observed signals, especially in pulsed EPR experiments.
Initially, without the microwave field present (Fig. 6.3a), the magnetization,
$M_0$, is aligned along the static field, $B_0$, in the z-direction. When the microwave
field, $B_1$, is applied perpendicular to the external field, the magnetization precesses
about the sum of the two fields. This results in the spiral motion of the magnetization
shown in Fig. 6.3b. If we transform to a frame of reference rotating with the microwave
frequency (Fig. 6.3c), the microwave field is now static and, if the phase is
chosen so that $B_1$ is along the negative x'-direction, the magnetization precesses
towards the y'-direction. With the phase of the detection set to y', the precession of
the magnetization results in an increase of the observed signal. However, as the
magnetization continues to precess, the signal decreases, becomes negative, then positive
again, and so on, in an oscillatory fashion. This oscillatory motion of the magnetization
about the microwave field in the rotating frame is referred to as nutation.

Fig. 6.3 Behavior of the magnetization in an EPR experiment in the laboratory
frame and in the rotating frame
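
The nutation described above can be sketched numerically. The Python snippet below is a minimal illustration that assumes exact resonance, no relaxation, and a microwave field of amplitude B1 that is static along the negative x'-direction in the rotating frame. The function names and the value of B1 are hypothetical choices made for the example.

```python
import math

G_E = 2.0023193        # free-electron g-factor
BETA_E = 9.27401e-24   # Bohr magneton, J/T
HBAR = 1.054572e-34    # reduced Planck constant, J s


def nutation_frequency(b1):
    """Angular nutation frequency (rad/s) about a resonant field b1 (T)."""
    return G_E * BETA_E * b1 / HBAR


def rotating_frame_magnetization(t, b1, m0=1.0):
    """Return (M_y', M_z) at time t (s) for M starting along z, B1 along -x'."""
    angle = nutation_frequency(b1) * t
    return m0 * math.sin(angle), m0 * math.cos(angle)


b1 = 1.0e-3  # T, hypothetical microwave field amplitude
t_90 = math.pi / (2.0 * nutation_frequency(b1))  # time to rotate M by 90 degrees
print(f"nutation frequency = {nutation_frequency(b1) / (2 * math.pi) / 1e6:.1f} MHz")
print(f"90 degree rotation after {t_90 * 1e9:.2f} ns")
for n in range(5):
    m_y, m_z = rotating_frame_magnetization(n * t_90, b1)
    print(f"t = {n} x t_90: M_y' = {m_y:+.2f}, M_z = {m_z:+.2f}")
```

The y'-component rises, falls, changes sign, and returns, which is the oscillatory signal referred to in the text.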
6.3  Experimental  Techniques 

6.3.1    Choice  of  Magnetic  Field  and  Microwave  Frequency 

The frequency of the radiation needed to cause a transition between the energy
levels of a free electron is $\nu = \Delta E/h = g_e \beta_e B_0/h$. The choice of $B_0$
and $\nu$ is governed largely by available magnet and microwave or far infrared
technology. Microwave sources and components are generally produced for specific,
relatively narrow frequency bands, each designated by a letter. X-band components operate
at ~9.3 GHz and inexpensive klystron or Gunn diode sources are available. The
corresponding magnetic field, $B_0 \approx 0.33$ T, can be easily produced by a moderately
low power/current electromagnet. In general, the g-value depends on the orientation of the
molecule with respect to the magnetic field. Hence, low temperature EPR spectra of
randomly ordered samples are broad and have features corresponding to molecules
at different orientations. The resolution of these features depends on the
field/frequency combination that is used. Commercial spectrometers are now available at
the following frequencies (along with their coded designations): 1.2 GHz (L-band),
2.4 GHz (S-band), 9.3 GHz (X-band), 34 GHz (Q-band), 94 GHz (W-band), and
263 GHz (mm-band). X-band spectrometers remain by far the most popular because
of their comparatively low cost and ease of operation while providing sufficient
resolution for metalloproteins. Q-band spectrometers are also widely used because
they allow the spectra of some organic radicals to be partially resolved at a field
strength that is achievable with an electromagnet. The higher frequency bands
(W-band and higher) require superconducting magnets but provide an order of magnitude
better spectral resolution compared to X-band. In practice, the most appropriate
frequency band for a particular experiment depends on many factors and it is
usually advantageous to have data at several frequency bands.
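
To make the link between frequency band, resonance field, and g-resolution concrete, the following Python sketch computes the resonance field at g = g_e for each band listed above, together with the field separation between two nearby g-values. The two g-values are hypothetical, chosen only to resemble the small g-differences found in organic radicals.

```python
G_E = 2.0023193         # free-electron g-factor
BETA_E = 9.27401e-24    # Bohr magneton, J/T
H_PLANCK = 6.62607e-34  # Planck constant, J s

# Band frequencies in GHz as listed in the text
BANDS_GHZ = {"L": 1.2, "S": 2.4, "X": 9.3, "Q": 34.0, "W": 94.0, "mm": 263.0}


def resonance_field(freq_ghz, g=G_E):
    """Resonance field (T) for a given microwave frequency and g-value."""
    return H_PLANCK * freq_ghz * 1e9 / (g * BETA_E)


def field_separation_mt(freq_ghz, g1, g2):
    """Field difference (mT) between the resonances of two g-values."""
    return abs(resonance_field(freq_ghz, g1) - resonance_field(freq_ghz, g2)) * 1e3


for band, freq in BANDS_GHZ.items():
    sep = field_separation_mt(freq, 2.0030, 2.0060)  # hypothetical g-values
    print(f"{band:>2}-band: B0 = {resonance_field(freq):6.3f} T, "
          f"separation = {sep:6.3f} mT")
```

Because the separation scales linearly with frequency, the W-band value is about ten times the X-band value, in line with the order of magnitude gain in resolution mentioned above.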


6.3.2    Sensitivity  and  Its  Consequences  for  EPR  Measurements 

The population difference between the $m_s = +1/2$ and $m_s = -1/2$ states, which is
required to observe the absorption of radiation, plays a crucial role in the design of
EPR spectrometers. At thermal equilibrium the population difference is governed
by the Boltzmann distribution:

$\frac{N_\alpha}{N_\beta} = \exp\left(-\frac{\Delta E}{kT}\right)$,  (6.7)

where $N_\alpha$ and $N_\beta$ refer to the populations of the $m_s = +1/2$ and $-1/2$
states, respectively, and $\Delta E = h\nu = g_e \beta_e B_0$. At 298 K and $B_0 = 0.33$ T
($\nu = 9.25$ GHz, X-band), $N_\alpha/N_\beta$ is 0.998, which gives
$\Delta N/N \approx 10^{-3}$, where $\Delta N = (N_\beta - N_\alpha)$ and
$N = (N_\alpha + N_\beta)$. In comparison, transitions in the visible region have
$\Delta N/N \approx 1$. Because the transition probability is directly proportional to
$\Delta N$, EPR is a much less sensitive technique than optical spectroscopy. On the other
hand, $\Delta N/N = 5 \times 10^{-5}$ for proton NMR spectroscopy, which is much less
sensitive than EPR. As shown in (6.7), the population difference, and hence the transition
probability, can be increased by either lowering the temperature or increasing the
magnetic field strength. With $B_0 = 0.33$ T, about a 30-fold increase in signal strength
is obtained by lowering the temperature from 298 to 10 K, and a further tenfold increase
is obtained by going to 1 K. Temperatures down to 4.2 K are possible with the use of
liquid He gas-flow cooling systems, whereas temperatures down to 1.6 K require a
specialized cryostat and pumped He. Increasing the magnetic field strength from 0.33 T
(X-band) to 1.2 T (Q-band) at 10 K results in a 3.5-fold increase in the signal strength.
A tenfold increase in signal strength is obtained at 3.3 T (W-band) compared to X-band.
However, the dimensions of the components scale with the wavelength of the microwaves, and
hence the sample size becomes smaller as the frequency increases. This can be an
advantage when dealing with biological samples that are difficult to prepare, but it
also means that the total number of spins in the sample is lower. Another drawback
is the increased cost of the microwave components and the magnets at higher fields.
At very high frequencies, quasi-optical technologies must be used.
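
The Boltzmann factors discussed above are easy to evaluate directly. The following Python sketch, which assumes an isolated two-level system with g = g_e, prints the population ratio and the fractional population difference for several field and temperature combinations. The function names are illustrative.

```python
import math

G_E = 2.0023193        # free-electron g-factor
BETA_E = 9.27401e-24   # Bohr magneton, J/T
K_B = 1.380649e-23     # Boltzmann constant, J/K


def population_ratio(b0, temperature):
    """N_alpha / N_beta for a free electron at field b0 (T) and temperature (K)."""
    delta_e = G_E * BETA_E * b0
    return math.exp(-delta_e / (K_B * temperature))


def polarization(b0, temperature):
    """Fractional population difference (N_beta - N_alpha) / (N_beta + N_alpha)."""
    ratio = population_ratio(b0, temperature)
    return (1.0 - ratio) / (1.0 + ratio)


reference = polarization(0.33, 298.0)
for b0, temp in [(0.33, 298.0), (0.33, 10.0), (0.33, 1.0), (1.2, 10.0), (3.3, 10.0)]:
    p = polarization(b0, temp)
    print(f"B0 = {b0:4.2f} T, T = {temp:5.1f} K: "
          f"ratio = {population_ratio(b0, temp):.4f}, "
          f"dN/N = {p:.2e}, gain vs. 0.33 T / 298 K = {p / reference:7.1f}")
```

In this simple picture the gain follows B0/T until the Zeeman splitting becomes comparable to kT, after which the polarization approaches 1 and the gain levels off.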

As a consequence of the small Boltzmann factors involved in EPR experiments,
direct measurement of the microwave absorption is hampered by very poor signal to
noise. To overcome this problem, the absorption is measured indirectly using a
resonator and a microwave bridge. There are many different possible designs for the
resonator, but most commonly it is a closed cavity lined with highly conductive
metal with dimensions similar to the wavelength of the microwave radiation.
It is the microwave equivalent of a tuned, resonant circuit; hence, it stores energy at
its resonance frequency. The microwave bridge is coupled to the resonator and
tuned to its resonance frequency so that all of the incoming microwave energy is
stored. When an EPR transition occurs in the sample, the resonator is perturbed and
some of the incoming power from the microwave bridge is reflected. The amount of
reflected power is considerably larger than the amount absorbed by the EPR
transition, and so a larger signal is obtained than in a direct absorption experiment.


6.3.3    Resonator  Design  and  the  Problem  of  Damping 

In an ideal resonant cavity, the electrical properties of conductors dictate that the
tangential electric field component of the microwaves must be zero at the walls if
they are made of a highly conductive material [8]. This constrains the microwaves
in space and causes the formation of standing waves. The result is a separation of
the magnetic field lines of force from the electric field lines of force into different
regions of the resonant cavity. This separation satisfies three important requirements
for an EPR experiment. First, because only the magnetic component of the microwave
radiation promotes EPR transitions, and because the amplitude of the detected
signal is proportional to the amount of energy absorbed, the sample should be
located in the region where the microwave magnetic field is strongest. Second, absorption
of microwaves via electric-dipole transitions in the sample should be minimized, and
therefore the sample should be located in the region where the electric field is weakest.
Third, to satisfy the selection rules for the spin transitions, the microwave
magnetic field must be perpendicular to the direction of the laboratory magnetic
field. Note that this requirement applies when the Zeeman interaction dominates,
which is the usual situation. However, for integer (S = 1, 2, 3, etc.) spin systems in
which the zero-field splitting is larger than the Zeeman splitting (non-Kramers
systems), the magnetic component of the microwave radiation should be parallel to the
direction of the laboratory magnetic field [9]. These conditions can be readily met
in a rectangular resonator as depicted in Fig. 6.4. When the sample is placed in the
center of the resonator, it is exposed maximally to the magnetic lines of force and
minimally to the electric lines of force. A rather serious complication for biological
materials is that if the sample is in a high dielectric solvent such as water, the
electric component of the microwaves will be quenched, and along with it, the magnetic
component. This can be avoided in one of two ways. First, the sample can be placed
in a quartz tube and frozen; the phase change lowers the dielectric constant of water
from 80.1 at 20 °C to 4 at 0 °C. Second, if the sample needs to remain a liquid, a
so-called flat cell can be used, which places the high dielectric sample in the nodal
plane of the electric field. Alternatively, inexpensive, small diameter capillaries can
be used at both X- and Q-bands, with a considerable savings in sample volume and
ease of filling and cleaning.


6.3.4  Relaxation 

Another important factor governing the sensitivity of EPR spectroscopy is spin
relaxation. In a continuous wave experiment, microwave power is applied continuously to
the sample and both absorption and stimulated emission occur between the spin
states. Both the applied microwave field and oscillating fields in the local
environment of the paramagnetic center induce these transitions. The relative rates of
absorption and emission determine the net absorption of energy by the sample and hence the
amplitude of the EPR signal. If the spin system is perturbed from thermal equilibrium,
the transitions induced by the local fields return the system to equilibrium with
a characteristic time known as the spin-lattice relaxation time, $T_1$. In the rotating
frame picture, $T_1$ is the time constant associated with the return of the magnetization
to the z-direction. In the absence of relaxation, the populations of the spin states
would eventually become equal under continuous irradiation and hence no net
absorption would be observed. In the presence of relaxation, however, the population
difference is maintained. The size of the population difference depends on the relative
rates of relaxation, absorption, and stimulated emission. As the microwave power is
increased, the rate of absorption and stimulated emission can become faster than the
relaxation rate, and equalization of the populations, and hence loss of signal, occurs.
This phenomenon is known as power saturation and tends to occur when $T_1$ is long.
Since $T_1$ usually increases as the temperature is lowered, power saturation occurs
more easily at cryogenic temperatures. Thus, slow relaxing systems such as organic
radicals are best observed at low microwave power and higher temperatures.

The relaxation rate also has an influence on the width of absorption lines. Here,
both spin-lattice relaxation and spin-spin or phase relaxation governed by $T_2$ play a
role. In EPR the resonant absorption of microwaves is damped by $T_1$ and $T_2$
relaxation, and the larger the damping the broader the spectrum. Thus, if the relaxation
is fast, the width of the spectrum is large, often to the point that it cannot be detected.
Considered quantum mechanically, the phenomenon of lifetime broadening can be
traced to the Heisenberg uncertainty principle, which can be restated as
$\tau \, \delta E \geq h/2\pi$, where $\tau$ is the mean lifetime and $\delta E$ is the
uncertainty in the energy of the system. As the lifetime becomes shorter, $\delta E$
becomes greater. Thus, as the relaxation becomes faster, the transition is spread over a
larger magnetic field range. In general, the linewidth has two contributions: (1)
the inhomogeneous linewidth, which is due to a distribution of local environments,
and (2) the homogeneous linewidth associated with the intrinsic properties of the
paramagnetic species, which defines a minimum linewidth for a given spin system.
Lifetime broadening usually determines the homogeneous linewidth and it is
particularly problematic for certain metal ions. For example, [4Fe-4S] clusters
typically cannot be observed at temperatures much above 20 K, although [2Fe-2S]
clusters can be observed at temperatures as high as 77 K.
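
The scale of lifetime broadening can be estimated from the uncertainty relation quoted above. The Python sketch below takes the equality as a lower bound on the energy uncertainty and converts it into a field width for g = g_e. The lifetimes are hypothetical values chosen only to span a useful range, and the function names are illustrative.

```python
import math

G_E = 2.0023193         # free-electron g-factor
BETA_E = 9.27401e-24    # Bohr magneton, J/T
H_PLANCK = 6.62607e-34  # Planck constant, J s


def lifetime_broadening_hz(tau):
    """Minimum frequency uncertainty (Hz) from tau * dE = h / (2 pi)."""
    return 1.0 / (2.0 * math.pi * tau)


def hz_to_millitesla(delta_nu, g=G_E):
    """Convert a frequency width (Hz) into a field width (mT) at g-value g."""
    return H_PLANCK * delta_nu / (g * BETA_E) * 1e3


for tau in (1e-6, 1e-8, 1e-10):  # hypothetical lifetimes in seconds
    width_hz = lifetime_broadening_hz(tau)
    print(f"tau = {tau:.0e} s: dnu = {width_hz / 1e6:9.3f} MHz, "
          f"dB = {hz_to_millitesla(width_hz):8.4f} mT")
```

A lifetime of 0.1 ns already corresponds to a width of tens of mT, which illustrates why rapidly relaxing centers can broaden beyond detection.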

A plot of signal amplitude vs. inverse temperature (Fig. 6.5) shows two regions.
The rise in signal amplitude as the temperature is lowered is termed the Curie law
region and is due to the increased population difference between the spin energy
levels as the temperature is lowered. The decline in signal amplitude beyond a certain
optimum is due to power saturation as spin-lattice and spin-spin relaxation
times become longer as the temperature is lowered. The shape of the saturation
curve at low temperature depends on the relaxation mechanism.

Fig. 6.5 Temperature dependence of the intensity of an EPR signal. At high
temperature the intensity follows Curie-law behavior. At low temperature the signal
intensity decreases due to power saturation as $T_1$ becomes long

A more quantitative approach [10] to determine the onset of saturation is to plot
$I/\sqrt{P}$ vs. $\sqrt{P}$ for a given temperature, where $I$ is the signal intensity and
$P$ is the microwave power (Fig. 6.6). In the non-saturating region (low power), the
signal intensity is proportional to the square root of the power and the plot is
horizontal. At the onset of power saturation, the plot shows a break and a downward slope.
The point at which $I/\sqrt{P}$ drops by 50 % is labeled $P_{1/2}$, and is particularly
useful in identifying interactions between two spin systems. If a rapidly relaxing spin
system is in proximity to a slower relaxing spin system, the former will enhance the
relaxation of the latter, and this can be measured by a change in the $P_{1/2}$ of the
slower spin system.
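
The behavior of such a saturation plot can be sketched with a simple model. The Python snippet below assumes that I/√P falls off as 1/(1 + P/P_1/2). This functional form is an assumption made for illustration and is not taken from the text, since real saturation curves also depend on the degree of inhomogeneous broadening. The two P_1/2 values are hypothetical.

```python
import math


def normalized_ratio(power, p_half):
    """I / sqrt(P) relative to its low-power limit (assumed saturation model)."""
    return 1.0 / (1.0 + power / p_half)


def signal_intensity(power, p_half, scale=1.0):
    """Signal intensity I for microwave power P, in the same units as p_half."""
    return scale * math.sqrt(power) * normalized_ratio(power, p_half)


p_half_isolated = 1.0   # mW, hypothetical slowly relaxing spin on its own
p_half_coupled = 10.0   # mW, hypothetical value with a fast relaxer nearby

for power in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"P = {power:6.2f} mW: "
          f"isolated I/sqrt(P) = {normalized_ratio(power, p_half_isolated):.3f}, "
          f"coupled I/sqrt(P) = {normalized_ratio(power, p_half_coupled):.3f}")
```

In this model the ratio is flat at low power and has dropped to one half at P = P_1/2. A larger P_1/2 for the coupled case mimics the relaxation enhancement caused by a nearby fast relaxing spin system.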
6.3.5    Conventional  Field-Swept  CW  EPR  and  Field  Modulation  Lock-In  Detection 

To obtain an EPR spectrum, the microwave absorption must be measured as either
the microwave frequency or the magnetic field is swept over a finite range. When a
resonator is used for detection, the microwave frequency can only be varied over a
range that is small compared to the bandwidth of the resonator. For many samples,
the width of the EPR spectrum is much larger than the bandwidth of the resonator.
Thus, the field must be swept. Despite the use of a resonator, EPR signals are still
very weak and difficult to detect. The problem arises from 1/f noise: a directly
detected signal falls in the noisy DC region of the frequency spectrum. The S/N can be
improved by converting the DC signal to an AC signal that is in a quieter region of the
spectrum, applying a narrow bandpass filter to reject noise above and below that
frequency, and demodulating the signal to bring it back to DC (actually, a time-varying,
low frequency signal). This scheme is termed "phase-sensitive" detection and employs what
is known as a "lock-in" amplifier. In practice, the AC modulation is applied with a small
set of modulation coils attached to the resonator. The coils modulate the static
magnetic field by as much as several mT, typically at a frequency of 100 kHz, while $B_0$
is swept slowly over a fixed range (Fig. 6.7). The height of the modulated signal is
proportional to the slope of the absorption line, and it oscillates in phase and 180°
out of phase with the field modulation when the slope is positive and negative,
respectively. Thus, phase-sensitive detection produces a lineshape that is the first
derivative of the absorption line. The absorption spectrum can be reconstructed by
integrating the spectrum obtained using field modulation, but because EPR signals
are usually broad, the derivative signal is normally retained, as it reveals subtle
features that would be less pronounced in the integrated signal. Figure 6.8 shows the
relationship between an absorptive signal (Fig. 6.8a) and its derivative (Fig. 6.8b),
and, in the case of a more complex signal (Fig. 6.8c, d), how the derivative brings out
features that would be difficult to detect in the absorption spectrum.
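
The origin of the first-derivative lineshape can be checked with a small numerical experiment. The Python sketch below assumes a Lorentzian absorption line and an ideal lock-in that multiplies the detected signal by the modulation reference and averages over one period. All names and parameter values (line position, width, modulation amplitude) are hypothetical.

```python
import math


def absorption(b, b_res=330.0, half_width=1.0):
    """Lorentzian absorption line; fields and half width in mT (example values)."""
    return 1.0 / (1.0 + ((b - b_res) / half_width) ** 2)


def lock_in_output(b0, mod_amp=0.05, n_samples=64):
    """First-harmonic amplitude of the absorption under sinusoidal field modulation.

    The field is b0 + mod_amp * cos(phase); the detected signal is multiplied by
    the reference cos(phase) and averaged over one modulation period.
    """
    total = 0.0
    for k in range(n_samples):
        phase = 2.0 * math.pi * k / n_samples
        total += absorption(b0 + mod_amp * math.cos(phase)) * math.cos(phase)
    return 2.0 * total / n_samples


def slope(b, step=1.0e-4):
    """Numerical first derivative of the absorption line."""
    return (absorption(b + step) - absorption(b - step)) / (2.0 * step)


for b0 in (329.0, 329.5, 330.0, 330.5, 331.0):
    print(f"B0 = {b0:.1f} mT: lock-in / mod_amp = {lock_in_output(b0) / 0.05:+.4f}, "
          f"slope = {slope(b0):+.4f}")
```

For a modulation amplitude that is small compared to the linewidth, the lock-in output divided by the modulation amplitude tracks the slope of the line. Larger modulation amplitudes increase the signal but distort the lineshape.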


6.3.6   Pulsed  EPR 


In magnetic resonance experiments, the use of pulses provides an attractive
alternative to sweeping the field or frequency. When a short pulse is applied to the
sample, the spin system is brought out of equilibrium. Then, as the spin system
relaxes back to thermal equilibrium, it reemits some of the radiation as a
free-induction decay (FID). The Fourier transform of the FID corresponds to the
absorption spectrum. One of the many advantages of carrying out the experiment in this
way is that by collecting all of the frequencies simultaneously, the measurement
time is shorter and hence less noise is collected, which is the so-called multiplex
advantage. In addition, multiple pulses can be used to manipulate the spin system
to obtain information about specific interactions. These methods are standard
practice in NMR spectroscopy, but significant obstacles are encountered when they
are applied to EPR.

Fig. 6.7 Field modulation of an EPR absorption line

Fig. 6.8 Comparison between absorption lineshapes and their first derivatives as detected
using field modulation. (a) A single absorption line, (b) Its first derivative, (c) A
broadened 1:2:2:1 quartet, (d) Its first derivative

The  first  problem  is  that  of  the  excitation  bandwidth.  The  range  of  frequencies 
contained  within  a  pulse  is  inversely  proportional  to  the  pulse  length.  Thus,  the 
broader  the  required  frequency  range,  the  shorter  the  pulses  must  be.  The  shortest 
microwave  pulses  that  can  be  easily  produced  are  on  the  order  of  a  few  nanosec- 
onds. This  is  sufficient  to  excite  a  spectral  width  of  several  hundred  MHz  or  -10  mT. 
However,  many  EPR  spectra  are  much  broader  than  this.  The  second  problem  is  that 
as  the  pulse  length  is  shortened  and  the  excitation  bandwidth  increases,  the  amount 
of  power  at  any  given  frequency  decreases.  As  a  result,  high  power  pulses  must  be 
used  to  achieve  sufficient  intensity  across  the  excitation  bandwidth.  Finally,  the 
spectrometer  deadtime  is  also  problematic.  Following  a  short  high  intensity  micro- 
wave pulse,  the  microwave  cavity  must  "ring  down"  before  the  signal  can  be  mea- 
sured. The  ring-down  time  can  be  longer  than  the  decay  time  of  the  FID  depending 
on  the  sample  and  type  of  resonator  used.  Because  of  these  constraints,  single  pulse 
Fourier  transform  methods  are  only  useful  for  systems  with  very  sharp  lines  and  a 
relatively  narrow  overall  spectral  width.  The  EPR  spectra  of  most  biological  sam- 
ples do  not  fall  into  this  category.  However,  it  is  possible  to  study  systems  with 
broad  spectra  by  pulsed  EPR  if  spin-echo  methods  are  used. 
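
The inverse relation between pulse length and excitation bandwidth can be made concrete with a rough estimate. The sketch below assumes that the bandwidth of a rectangular pulse is approximately the reciprocal of its length and that g is close to the free-electron value; the function names are illustrative only, and the conversion factor h/β_e is the value quoted later in this chapter.

```python
# Rough estimate only: bandwidth ~ 1 / pulse length, g close to 2 assumed.
H_OVER_BETA_E = 0.07144773  # mT per MHz (h / beta_e)

def excitation_bandwidth_mhz(pulse_length_ns):
    """Approximate excitation bandwidth (MHz) of a rectangular pulse."""
    return 1.0e3 / pulse_length_ns  # 1 / ns corresponds to 1000 MHz

def mhz_to_mt(width_mhz, g=2.0023):
    """Convert a frequency width (MHz) into a field width (mT) for a given g-value."""
    return width_mhz * H_OVER_BETA_E / g

for t_p in (2.0, 4.0, 10.0):  # hypothetical pulse lengths in ns
    bw = excitation_bandwidth_mhz(t_p)
    print(f"{t_p:4.0f} ns pulse: ~{bw:5.0f} MHz, ~{mhz_to_mt(bw):4.1f} mT")
```

Under these assumptions a pulse of a few nanoseconds covers a few hundred MHz, which is consistent with the figure of roughly 10 mT given above.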

By applying two pulses separated by a time, $\tau$, the magnetization of the sample
can be refocused to generate a spin-echo at a time $\tau$ after the second pulse as shown
in  Fig.  6.9a.  The  lengths  of  the  two  pulses  are  chosen  so  that  the  magnetization  is 
rotated  through  90°  and  180°,  respectively.  As  can  be  seen  in  the  vector  diagrams  in 
Fig.  6.9b,  the  first  pulse  rotates  the  magnetization  into  the  x'y'  plane.  Different  spin 
packets  in  the  sample  have  different  precession  frequencies  and  so  their  contribu- 
tions to  the  magnetization  fan  out.  The  second  pulse  inverts  the  magnetization  so 
that the precession refocuses the magnetization at time $2\tau$ after the first pulse. There are many other
pulse  sequences  that  also  produce  echoes  but  they  will  not  be  described  here  and 
readers  are  referred  to  [5]  for  details. 
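
The refocusing described above can be illustrated with a small numerical model. This is a minimal sketch, assuming ideal pulses, no relaxation, and a hypothetical Gaussian distribution of precession offsets; each spin packet is represented only by its phase in the rotating frame.

```python
import math
import random

def echo_signal(offsets, tau, t):
    """Net transverse magnetization a time t after the 180-degree pulse.

    Each packet acquires a phase w*tau before the pulse, the pulse negates
    that phase, and the packet then acquires w*t, giving w*(t - tau)."""
    return sum(math.cos(w * (t - tau)) for w in offsets) / len(offsets)

random.seed(0)
# Hypothetical inhomogeneous line: offsets in rad/us with sigma = 2*pi*5 MHz
offsets = [random.gauss(0.0, 2.0 * math.pi * 5.0) for _ in range(2000)]
tau = 0.4  # us, separation of the two pulses

times = [i * 0.01 for i in range(81)]  # 0 to 0.8 us after the second pulse
signal = [echo_signal(offsets, tau, t) for t in times]
t_echo = times[signal.index(max(signal))]
print(f"echo maximum at t = {t_echo:.2f} us after the second pulse")
```

In this model the signal directly after the second pulse is close to zero because the packets have fanned out, and it recovers fully when the time after the second pulse equals the pulse separation.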

Several  basic  types  of  pulsed  EPR  experiments  can  be  performed  using  spin- 
echo  detection.  The  first  is  the  so-called  field  swept  echo  spectrum.  Usually,  the 
width  of  the  EPR  spectrum  is  much  larger  than  the  excitation  bandwidth  so  that  the 
height  of  the  echo  varies  depending  on  the  value  of  the  magnetic  field  at  which  it  is 
measured.  If  the  height  of  the  echo  is  plotted  as  a  function  of  the  magnetic  field,  the 
EPR  absorption  spectrum  is  obtained.  This  method  is  particularly  useful  when  two 
species  with  different  linewidths  are  present  in  the  sample.  If  field  modulation  is 
used  in  such  cases,  the  relative  amplitudes  of  the  two  components  depend  on  the 
modulation  amplitude.  Hence,  it  is  difficult  to  determine  their  relative  intensities  in 
a  single  experiment.  This  problem  does  not  occur  in  the  field  swept  echo  spectrum 
provided that the same fraction of the magnetization can be refocused for both species.

In  the  second  class  of  experiments,  the  height  of  the  echo  is  measured  as  a  func- 
tion of  one  or  more  of  the  delay  times  or  pulse  lengths  in  the  pulse  sequence.  It  is 
these  types  of  experiments  for  which  pulsed  EPR  is  particularly  useful  because 
specific  interactions  in  the  spin  system  can  be  measured  independently.  A  complete 
list of the various pulse experiments goes beyond the scope of this chapter and we
summarize  only  the  few  most  common  sequences  here. 

Relaxation: The spin-lattice relaxation time $T_1$ can be measured by the inversion
recovery sequence $(\pi) - t_d - (\pi/2) - \tau - (\pi) - \tau - \text{echo}$. The first pulse
inverts the magnetization, then the second and third pulses generate an echo that probes
the magnetization as it relaxes. The relaxation time is determined by measuring the height
of the echo as a function of the delay time $t_d$. The relaxation time $T_2$ can be obtained
by applying the Hahn echo sequence followed by a train of 180° pulses,
$(\pi/2)_x - [\tau - (\pi)_y - \tau - \text{echo}]_n$. This is the Carr-Purcell-Meiboom-Gill
sequence [11, 12] and the height of the echo decays with $T_2$ as a function of the time of
the echo after the first pulse.
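
As an illustration of how a relaxation time is extracted from the echo height, the sketch below assumes an ideal inversion pulse and a single-exponential recovery, so that the echo height follows M(t_d) = M0 (1 - 2 exp(-t_d / T1)). The delays and the value of T1 are hypothetical, and real data would normally be fitted with a nonlinear routine that also allows for incomplete inversion.

```python
import math

def inversion_recovery(t_d, t1, m0=1.0):
    """Echo height after an ideal inversion pulse and a recovery delay t_d."""
    return m0 * (1.0 - 2.0 * math.exp(-t_d / t1))

def estimate_t1(delays, heights, m0=1.0):
    """Least-squares slope of ln((m0 - M) / (2 m0)) versus t_d equals -1/T1."""
    ys = [math.log((m0 - m) / (2.0 * m0)) for m in heights]
    n = len(delays)
    mean_x = sum(delays) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delays, ys))
    sxx = sum((x - mean_x) ** 2 for x in delays)
    return -sxx / sxy

delays = [5.0 * i for i in range(1, 21)]  # us, hypothetical delay times
heights = [inversion_recovery(t, t1=30.0) for t in delays]
print(f"recovered T1 = {estimate_t1(delays, heights):.1f} us")
```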

Hyperfine  coupling:  In  a  magnetic  field  the  nuclei  in  a  paramagnetic  compound 
experience  both  the  external  magnetic  field  and  a  local  field  caused  by  the  magnetic 
moment  of  the  unpaired  electron(s).  In  a  pulsed  EPR  experiment  the  microwave 
pulses  reorient  the  local  field  suddenly,  which  leads  to  coherent  oscillation  of  the 
nuclear spin. As a result, the height of the echo in the Hahn sequence $(\pi/2) - \tau - (\pi)$
is modulated as a function of the delay time, $\tau$. Taking the Fourier transform of the
modulation gives an electron spin-echo envelope modulation (ESEEM) [13] spectrum
from which hyperfine couplings and the types of coupled nuclei can be obtained.
Hyperfine sublevel correlation spectroscopy (HYSCORE) [14] is a two-dimensional
variation of this method. The pulse sequence
$(\pi/2) - \tau - (\pi/2) - t_1 - (\pi) - t_2 - (\pi/2) - \tau - \text{echo}$
is applied and the echo height is measured as a function of $t_1$ and $t_2$. Double
Fourier transformation of the data gives a two-dimensional dataset, which can be
plotted as a contour plot. Such contour plots are widely used to determine the hyperfine
couplings of cofactors in proteins and probe the protein-cofactor interactions.
ESEEM and HYSCORE spectroscopy are relatively recent developments and
electron-nuclear  double  resonance  (ENDOR)  is  the  more  established  method  [6,  15, 
16]  for  measuring  hyperfine  couplings.  In  an  ENDOR  experiment,  radio  frequency 
(RF)  irradiation  is  used  to  manipulate  the  nuclear  spins  while  microwaves  are  applied 
to  observe  the  electron  spins.  Changes  in  the  EPR  signal  of  the  electron  spins  are 
monitored  as  the  frequency  of  the  applied  RF  is  varied.  The  original  ENDOR  experi- 
ments [17]  were  carried  out  using  continuous  wave  methods;  however,  pulse  meth- 
ods [18-20]  are  more  common  today.  There  are  several  different  pulse  sequences 
that  can  be  used  but  in  all  of  them  a  preparation  pulse  is  applied  first  to  the  electron 
spins.  A  frequency  selective  RF  pulse  is  then  applied  to  the  nuclear  spins  and  finally 
an  echo  sequence  is  applied  to  the  electron  spins.  The  height  of  the  echo  is  measured 
as  a  function  of  the  frequency  of  the  RF  pulse  to  generate  the  ENDOR  spectrum. 
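
The ESEEM step described earlier in this paragraph, Fourier transformation of the echo modulation, can be sketched with synthetic data. The example assumes a single modulation frequency of 14 MHz, a known exponential decay, and a direct (slow) discrete Fourier transform so that only the standard library is needed; all numerical values are hypothetical.

```python
import cmath
import math

def dft_magnitude(samples, dt):
    """Magnitude spectrum of a real time trace by direct DFT (adequate for small N)."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * j / n)
                  for j, s in enumerate(samples))
        freqs.append(k / (n * dt))
        mags.append(abs(acc))
    return freqs, mags

dt = 0.008       # us between successive values of the pulse separation (8 ns)
n_points = 250
nu_mod = 14.0    # MHz, assumed nuclear modulation frequency
t_decay = 1.5    # us, assumed echo decay constant
taus = [j * dt for j in range(n_points)]
echo = [math.exp(-t / t_decay) * (1.0 + 0.2 * math.cos(2.0 * math.pi * nu_mod * t))
        for t in taus]

# Divide out the decay (known here, fitted in practice) to isolate the modulation
modulation = [e / math.exp(-t / t_decay) - 1.0 for e, t in zip(echo, taus)]
freqs, mags = dft_magnitude(modulation, dt)
peak = freqs[mags.index(max(mags))]
print(f"ESEEM peak at {peak:.1f} MHz")
```

With experimental traces the deadtime, windowing, and zero-filling also affect the spectrum; these are deliberately left out of the sketch.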

Electronic  spin-spin  coupling:  In  systems  containing  more  than  one  paramagnetic 
center,  the  electron  magnetic  moments  couple  to  one  another  through  the  exchange 
interaction  and  the  dipolar  interaction.  The  exchange  coupling  shifts  the  energies  of 
the  possible  spin  configurations  relative  to  one  another  as  a  result  of  the  Pauli  prin- 
ciple. The  strength  of  the  exchange  coupling  drops  off  exponentially  as  the  distance 
between  the  centers  increases.  The  dipolar  coupling  is  the  interaction  of  each  of  the 
spins  with  the  magnetic  field  produced  by  the  other  spin.  Importantly,  the  dipolar 
coupling  depends  on  the  inverse  cube  of  the  distance  between  the  paramagnetic 
centers.  For  distances  greater  than  ~1  nm,  the  dipolar  coupling  dominates  and  mea- 
suring the  spin-spin  coupling  in  such  systems  can  be  very  useful  for  structure  deter- 
mination. There  are  two  common  methods  for  determining  the  spin-spin  coupling. 
In double electron-electron resonance (DEER) [21] or pulsed electron double resonance
(PELDOR) [22] two different microwave frequencies are used. There are
several  pulse  schemes  that  can  be  used  for  the  experiment  but  all  of  them  have  an 
echo  sequence  consisting  of  a  preparation  sequence,  a  mixing  period  and  a  detection 
pulse  that  is  applied  to  one  radical  and  an  inversion  pulse  at  a  different  frequency 
that  is  applied  to  the  other  radical  during  the  mixing  period.  The  four-pulse  DEER 
sequence shown in Fig. 6.10 is used most widely for distance determination.
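
The inverse-cube distance dependence is what makes the dipolar coupling useful as a molecular ruler. The following sketch evaluates the dipolar frequency in the point-dipole approximation for the perpendicular orientation, neglecting exchange coupling and g-anisotropy; the function names and the chosen distances are illustrative assumptions.

```python
import math

MU_0 = 4.0e-7 * math.pi   # T m / A (classical value, adequate here)
MU_B = 9.2740100783e-24   # J / T, Bohr magneton
H = 6.62607015e-34        # J s, Planck constant

def dipolar_frequency_mhz(r_nm, g1=2.0023, g2=2.0023):
    """Dipolar frequency (MHz) of two point-dipole electron spins r_nm apart."""
    r_m = r_nm * 1.0e-9
    nu_hz = MU_0 * g1 * g2 * MU_B ** 2 / (4.0 * math.pi * H * r_m ** 3)
    return nu_hz / 1.0e6

def distance_nm(nu_mhz, g1=2.0023, g2=2.0023):
    """Invert the inverse-cube law to obtain the distance from a frequency."""
    return (dipolar_frequency_mhz(1.0, g1, g2) / nu_mhz) ** (1.0 / 3.0)

for r in (2.0, 4.0, 6.0):  # nm, hypothetical spin-spin distances
    print(f"r = {r:.0f} nm -> {dipolar_frequency_mhz(r):.3f} MHz")
```

Doubling the distance lowers the frequency by a factor of eight, so longer distances require correspondingly longer observation windows in the time trace.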

In photosynthetic reaction centers and the cryptochrome/photolyase family of
proteins, light excitation leads to sudden generation of a radical pair from the excited
singlet  state  of  a  chlorophyll  or  flavin.  When  the  Hahn  echo  sequence  is  applied  to 
radical  pairs  generated  in  this  way,  the  echo  is  formed  90°  out-of-phase,  i.e.,  parallel 
to  the  direction  of  the  microwave  radiation  rather  than  perpendicular  to  it  [23].  This 
out-of-phase  echo  shows  deep  modulation  as  a  function  of  the  pulse  separation  in 
the echo sequence, from which the spin-spin coupling can be obtained [24].


6.3.7    Transient  EPR

Most  photoreactions  generate  short-lived  paramagnetic  species  and  time-resolved 
EPR  methods  can  be  extremely  useful  for  studying  such  processes  because  they 
provide kinetic rates as well as detailed information about the geometry and local
environment. These methods fall into two general classes: pulsed methods and
transient EPR. In transient EPR continuous microwave irradiation is used and the
EPR response to pulsed light excitation of the sample is monitored at a fixed
magnetic field as illustrated in Fig. 6.11.

Fig. 6.10 DEER or PELDOR pulse scheme

Fig. 6.11 The transient EPR experiment

At  a  given  magnetic  field  a  transient  response  is  obtained  and  the  transients 
collected  over  a  range  of  magnetic  field  strengths  can  be  assembled  to  create  a 
time/field dataset. Transient EPR spectra can be extracted from the dataset by plotting
the average signal in a given time window against magnetic field. An important
parameter in such experiments is the response time. If the signal is measured using
lock-in detection, the frequency of the field modulation and the bandwidth of the
lock-in amplifier limit the rise time to ~100 μs. Some improvement in the rise time
can be obtained by using a higher frequency for the field modulation but impedance
problems make it difficult to use frequencies much above 1 MHz. Thus, transient
EPR data are generally collected without using field modulation in what is referred
to as "direct detection." Without the field modulation the response time is determined
by the bandwidth of the resonator and the preamplifier and is typically
~50 ns. An important feature of transient EPR spectra is that they are spin-polarized.
Because of selection rules involved in the photoreaction the spin states of the
paramagnetic  species  generated  by  the  laser  flash  are  usually  selectively  populated. 
As  a  result,  both  absorptive  and  emissive  signals  are  observed.  The  analysis  of  the 
polarization  pattern  can  be  useful  for  determining  the  pathway  by  which  the  para- 
magnetic species  has  been  generated. 
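
Extracting a spectrum from the time/field dataset amounts to averaging each transient over a chosen time window. The sketch below assumes the dataset is stored as one list of signal values per field point; the tiny dataset and the function name are hypothetical and serve only to show the bookkeeping, with negative values standing for emissive signals.

```python
def spectrum_from_window(dataset, times, t_start, t_stop):
    """Average each field's transient over t_start <= t <= t_stop.

    dataset[i][j] is the signal at field point i and time point j."""
    idx = [j for j, t in enumerate(times) if t_start <= t <= t_stop]
    if not idx:
        raise ValueError("time window contains no samples")
    return [sum(row[j] for j in idx) / len(idx) for row in dataset]

# Hypothetical 3-field x 5-time dataset in arbitrary units
times = [0.0, 0.5, 1.0, 1.5, 2.0]  # us after the laser flash
dataset = [
    [0.0, 2.0, 1.0, 0.5, 0.2],      # absorptive transient
    [0.0, 0.0, 0.0, 0.0, 0.0],      # no signal at this field
    [0.0, -2.0, -1.0, -0.5, -0.2],  # emissive transient
]
early = spectrum_from_window(dataset, times, 0.5, 1.0)
print(early)
```

Repeating the extraction for several windows gives the time evolution of the polarization pattern.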


6.4    Characteristic  Lineshapes 

EPR  spectra  and  indeed  all  types  of  magnetic  resonance  spectra  show  characteristic 
lineshapes  depending  on  the  conditions  under  which  they  have  been  measured.  The 
nature  of  the  spectrum  depends  on  which  terms  contribute  to  the  Hamiltonian, 
which  is  governed  largely  by  motion  of  the  molecules  and  how  they  interact  with 
their  surroundings. 


6.4.1    Rapid  Tumbling  Regime

In  dilute  solution  at  room  temperature,  the  solute  molecules  can  be  treated  as  being 
isolated from one another and in low viscosity solvents the correlation time for rotation
of small organic molecules is on the order of $10^{-10}$ s, which is fast enough to
cause the interactions in the Hamiltonian to be averaged. The zero-field splitting
interaction  averages  to  zero  and  in  most  cases  the  Hamiltonian  contains  terms  for 
the  average  Zeeman  energy  of  the  electrons  and  nuclei  and  average  hyperfine  cou- 
pling between  the  electrons  and  nuclei.  To  a  good  approximation,  the  nuclear 
Zeeman energy does not change for EPR transitions, and hence need not be considered
further. Thus, the Hamiltonian can be written as:

$$\hat{H} = g_{iso}\,\beta_e B_0 \hat{S}_z + \sum_i A_i \hat{S}_z \hat{I}_{z,i} \qquad (6.8)$$

where the sum is over all magnetic nuclei and it has been assumed that the magnetic
field is high enough to make the Zeeman term much larger than the hyperfine term.
The corresponding energies are:

$$E = g_{iso}\,\beta_e B_0 m_S + \sum_i A_i m_S m_{I,i} \qquad (6.9)$$

where $m_S$ and $m_I$ are the quantum numbers for the z-components of electron and
nuclear magnetic moments, respectively. Figure 6.12 shows the energy level diagram
for a single unpaired electron coupled to a single $I = 1$ nucleus like nitrogen as
found in nitroxide radicals. As can be seen, the hyperfine coupling splits each of the
two energy levels of the unpaired electron into three equally spaced levels separated
by $A/2$, half the hyperfine coupling constant. The selection rules for the transitions are
$\Delta m_S = \pm 1$, $\Delta m_I = 0$, since the microwaves flip the electron spin but not the nuclear
spins. Thus the energies of the EPR lines are:

$$E = h\nu = g_{iso}\,\beta_e B_0 + \sum_i A_i m_{I,i} \qquad (6.10)$$

Fig. 6.12 Energy level diagram for a system consisting of a single unpaired electron and an $I = 1$
nucleus. Left: Splitting due to the electron Zeeman interaction and the hyperfine interaction. Right:
Field swept EPR experiment

Fig. 6.13 Calculated solution spectrum of a nitroxide radical showing the isotropic g-factor and
hyperfine couplings. The couplings to the low abundance nuclei $^{15}$N and $^{13}$C have not been included

This results in spectra with sharp lines centered around $B_0 = h\nu / (g_{iso}\beta_e)$. Note that
the hyperfine term does not include the laboratory magnetic field (which has no
effect on the magnetic coupling of the fixed nuclear spin with the fixed electron
spin). Because $m_I = I, I-1, \ldots, -I+1, -I$, where $I$ is the nuclear spin, each nucleus
splits the signal into $2I+1$ components. A group of $n$ equivalent nuclei splits the
signal into $2nI+1$ components. The total number of lines is
$n_{total} = \prod_i (2 n_i I_i + 1)$. In
a field swept experiment (Fig. 6.12, right), the separation between lines that differ
by one in $m_I$ is the hyperfine splitting constant and is equal to $a = A / (g_{iso}\beta_e)$ with $A$ in
energy units. Usually, the hyperfine coupling constant (hfc), $A$, and the hyperfine
splitting constant, $a$, are specified in MHz and mT, respectively. In these units,
$a = hA / (g_{iso}\beta_e)$ with $h/\beta_e = 0.07144773$ mT MHz$^{-1}$. An example of the solution
spectrum of a nitroxide radical is shown in Fig. 6.13. As can be seen, the $I = 1$ nitrogen
nucleus splits the EPR signal into three equal intensity lines. Each line is split further
into a 13-line pattern by the 12 equivalent methyl protons ($I = 1/2$). For groups
of equivalent nuclei, the intensities of the lines are the coefficients of a binomial
expansion that can be determined using Pascal's triangle for $I = 1/2$. The strength of
the hyperfine coupling depends on the spin density at the nucleus. Hence, the nitrogen
splitting is much larger than the splitting from the protons. The splitting from
the remaining protons on the ring is too small to be observed.
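
The counting rules for hyperfine lines can be checked with a few lines of code. This is a minimal sketch; the function names are illustrative, the groups are given as (number of equivalent nuclei, nuclear spin) pairs, and the binomial intensities apply only to I = 1/2 nuclei as stated in the text.

```python
from math import comb

H_OVER_BETA_E = 0.07144773  # mT per MHz, value quoted in the text

def lines_from_group(n, spin):
    """Number of lines from n equivalent nuclei of spin I: 2nI + 1."""
    return int(round(2 * n * spin + 1))

def total_lines(groups):
    """Product of (2 n_i I_i + 1) over groups given as (n_i, I_i) pairs."""
    total = 1
    for n, spin in groups:
        total *= lines_from_group(n, spin)
    return total

def binomial_intensities(n):
    """Relative intensities for n equivalent I = 1/2 nuclei (a row of Pascal's triangle)."""
    return [comb(n, k) for k in range(n + 1)]

def splitting_mt(a_mhz, g_iso=2.0023):
    """Hyperfine splitting constant a (mT) from the coupling constant A (MHz)."""
    return H_OVER_BETA_E * a_mhz / g_iso

# Nitroxide of Fig. 6.13: one nitrogen (I = 1) and 12 equivalent methyl protons (I = 1/2)
nitroxide = [(1, 1), (12, 0.5)]
print(total_lines(nitroxide))
print(binomial_intensities(12))
```

The outermost lines of the 13-line proton pattern are much weaker than the central one, which is why they can be hard to see in an experimental spectrum.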

Fig. 6.14 Demonstration of the EPR spectra and rotation patterns from a single crystal. (a)
Calculated EPR spectra for the rotation of the crystal about one of the crystal axes. (b) Rotation
pattern of the line positions. The crystal is assumed to have two paramagnetic centers per unit cell


6.4.2    Rigid  Limit  Regime 

In  frozen  solution,  the  motion  becomes  too  slow  to  cause  averaging  of  the 
Hamiltonian  and  the  system  is  said  to  be  in  the  rigid  limit.  Indeed  many  biomole- 
cules  are  sufficiently  large  that  their  motion  is  in  the  rigid  limit  even  at  room  tem- 
perature in  liquid  solution.  The  slow  rate  of  tumbling  has  a  profound  impact  on  the 
EPR  spectrum.  This  is  because  essentially  all  of  the  interactions  that  contribute  to 
EPR  spectra  depend  on  the  orientation  of  the  molecule  with  respect  to  the  magnetic 
field.  If,  for  example,  a  protein  contains  a  paramagnetic  center  and  a  single  crystal 
of  the  protein  can  be  grown,  the  EPR  spectrum  changes  as  the  orientation  of  the 
crystal  is  changed  in  the  magnetic  field.  The  series  of  spectra  taken  as  the  crystal  is 
rotated  about  a  particular  axis  is  illustrated  in  Fig.  6.14a. 

The positions of the lines in the spectra give effective g-values and the plot of
$g_{eff}^2(\theta)$ against $\theta$ is known as a rotation pattern (Fig. 6.14b) and can be described by
the function:

$$g_{eff}^2(\theta) = a_1 + a_2 \cos(2\theta) + a_3 \sin(2\theta) \qquad (6.11)$$

A fit of the rotation patterns gives values for $a_1$, $a_2$, and $a_3$. Typically, there are
several molecules in symmetry-related positions in the unit cell and the values of $a_1$,
$a_2$, and $a_3$ obtained from them are not independent. Rotation of the crystal about
three mutually perpendicular axes gives six independent values, which can be written
as the elements of a symmetric 3×3 matrix. This matrix is known as the g-tensor
and  it  depends  on  the  choice  of  the  three  rotation  axes.  However,  there  exists  a 
unique  set  of  axes  known  as  the  principal  axes  in  which  the  g-tensor  is  diagonal. 

Fig. 6.15 Examples of powder patterns arising from g-anisotropy. Left: Powder patterns as
detected in absorption (e.g., by field-swept echo detection). Right: Corresponding first derivative
spectra

In  this  frame  of  reference,  the  diagonal  elements  are  known  as  the  principal  values. 
The  g-tensor  obtained  from  the  rotation  patterns  can  be  diagonalized  numerically  to 
give  the  principal  values,  gxx,  gyy,  and  gzz  and  the  angles  between  the  principal  axes 
x,  y,  and  z  and  the  chosen  rotation  axes.  If  the  orientation  of  the  molecules  in  the  unit 
cell  is  known,  then  the  rotation  patterns  allow  the  orientation  of  the  principal  axes 
of  the  g-tensor  in  the  molecule  to  be  determined. 
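
Fitting a single rotation pattern to Eq. (6.11) is a linear problem. The sketch below assumes equally spaced rotation angles covering 0° to 180° (end point excluded), for which the three basis functions are orthogonal and the coefficients follow from simple sums; the synthetic coefficients are hypothetical. It also returns the largest and smallest effective g-value in the rotation plane, which are the in-plane principal values only; obtaining the full tensor requires combining three such planes as described above.

```python
import math

def fit_rotation_pattern(angles_deg, g_eff_sq):
    """Estimate a1, a2, a3 in g_eff^2 = a1 + a2 cos(2 theta) + a3 sin(2 theta).

    Assumes equally spaced angles over 0 to 180 degrees, end point excluded."""
    n = len(angles_deg)
    a1 = sum(g_eff_sq) / n
    a2 = 2.0 / n * sum(g * math.cos(2.0 * math.radians(t))
                       for t, g in zip(angles_deg, g_eff_sq))
    a3 = 2.0 / n * sum(g * math.sin(2.0 * math.radians(t))
                       for t, g in zip(angles_deg, g_eff_sq))
    return a1, a2, a3

def in_plane_extremes(a1, a2, a3):
    """Largest and smallest effective g-value within the rotation plane."""
    amp = math.hypot(a2, a3)
    return math.sqrt(a1 + amp), math.sqrt(a1 - amp)

# Synthetic rotation pattern with hypothetical coefficients
true_coeffs = (4.02, 0.03, -0.01)
angles = [10.0 * k for k in range(18)]
data = [true_coeffs[0]
        + true_coeffs[1] * math.cos(2.0 * math.radians(t))
        + true_coeffs[2] * math.sin(2.0 * math.radians(t))
        for t in angles]

fitted = fit_rotation_pattern(angles, data)
g_max, g_min = in_plane_extremes(*fitted)
print(fitted, g_max, g_min)
```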

For  biological  samples,  it  is  relatively  rare  to  have  single  crystals  available  and 
experiments  are  usually  carried  out  on  unordered  samples,  e.g.,  frozen  solution.  In 
this  case,  the  EPR  spectrum  is  the  sum  of  the  spectra  arising  from  the  randomly 
oriented  molecules  in  the  sample.  This  random  distribution  gives  the  characteristic 
powder  patterns  shown  in  Fig.  6.15.  The  absorption  spectra  that  would  be  obtained 
in  a  field-swept  echo  experiment  are  shown  on  the  left  of  Fig.  6.15,  and  those 
obtained  using  field  modulation  are  shown  on  the  right.  From  these  spectra,  the 
principal  g-values  are  easily  determined  but  no  information  about  the  orientation  of 
the  principal  axes  is  obtained.  Three  possibilities  for  g-value  anisotropy  are  illus- 
trated in  Fig.  6.15.  When  the  paramagnetic  center  has  cubic  symmetry,  the  three 
principal g-values are equivalent, i.e., $g_{xx} = g_{yy} = g_{zz}$ and the spectrum is a single
symmetrical line (Fig. 6.15, top) and is said to be "isotropic." In systems with
orthorhombic symmetry, all three principal g-values are inequivalent, i.e., $g_{xx} \neq g_{yy} \neq g_{zz}$
and the spectrum is referred to as "rhombic" (Fig. 6.15, bottom). When depicted as
an absorption signal, the spectrum has a low field turning point, a midfield peak, and
a  high  field  turning  point.  In  derivative  mode,  the  signal  appears  as  a  low  field  peak, 
a  midfield  derivative,  and  a  high  field  trough.  These  three  features  arise  from  mol- 
ecules with  their  x-,  y-,  and  z-axes  parallel  to  the  field,  respectively,  and  the  intensity 
observed  between  the  three  features  is  from  molecules  at  other  orientations.  Notice 
that because $B_0$ is inversely proportional to the g-value, the high field end of the
spectrum (largest value of $B_0$) corresponds to the smallest of the three principal
g-values and the low field end of the spectrum to the largest principal g-value. In
practice, the principal g-values are obtained as the maximum and the inflection
points of the rising and falling edges of the absorption spectrum. Using field
modulation, these correspond to the midpoint of the derivative-shaped signal and the
centers of mass of the positive and negative peaks, respectively.
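
The origin of the rhombic powder pattern can be illustrated by summing the resonance fields over a grid of orientations. This is a minimal sketch that produces a stick histogram without any linewidth, assumes a g-tensor with three principal values and no hyperfine structure, and weights each polar angle by sin θ to represent a random distribution of orientations. The g-values are those quoted later in this chapter for the reduced [4Fe-4S] cluster at 9 GHz; the grid and bin sizes are arbitrary choices.

```python
import math

H_OVER_BETA_E = 0.07144773  # mT per MHz

def resonance_field_mt(g, nu_mhz):
    """Resonance field B0 = h nu / (g beta_e) in mT."""
    return H_OVER_BETA_E * nu_mhz / g

def g_effective(gx, gy, gz, theta, phi):
    """Effective g-value for a field direction given by polar angles theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    return math.sqrt((gx * st * math.cos(phi)) ** 2
                     + (gy * st * math.sin(phi)) ** 2
                     + (gz * ct) ** 2)

def powder_histogram(gx, gy, gz, nu_mhz, n_bins=60, n_theta=180, n_phi=180):
    """Absorption stick histogram summed over one octant of orientations."""
    b_lo = resonance_field_mt(max(gx, gy, gz), nu_mhz)
    b_hi = resonance_field_mt(min(gx, gy, gz), nu_mhz)
    width = (b_hi - b_lo) / n_bins
    hist = [0.0] * n_bins
    for i in range(n_theta):
        theta = (i + 0.5) * (math.pi / 2.0) / n_theta
        weight = math.sin(theta)  # solid-angle weighting
        for j in range(n_phi):
            phi = (j + 0.5) * (math.pi / 2.0) / n_phi
            b = resonance_field_mt(g_effective(gx, gy, gz, theta, phi), nu_mhz)
            k = min(int((b - b_lo) / width), n_bins - 1)
            hist[k] += weight
    centers = [b_lo + (k + 0.5) * width for k in range(n_bins)]
    return centers, hist

centers, hist = powder_histogram(2.07, 1.93, 1.89, nu_mhz=9000.0)
peak_field = centers[hist.index(max(hist))]
print(f"absorption maximum near {peak_field:.1f} mT")
```

The histogram is non-zero only between the fields corresponding to the largest and smallest principal g-values, and its maximum lies near the field of the middle one, in line with the description of the absorption spectrum above.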

Paramagnetic  centers  with  threefold  or  higher  rotational  symmetry  about  a 
unique  symmetry  axis  are  said  to  have  axial  symmetry.  In  such  cases,  two  of  the 
principal g-values are equal such that $g_{xx} = g_{yy} \neq g_{zz}$. Here, there are two possible
labeling  conventions  for  the  axes.  The  symmetry  axes  are  labeled  such  that  the  axis 
with  the  highest  rotational  symmetry  is  the  z-axis.  In  EPR,  the  axis  associated  with 
the  smallest  g-value  is  usually  labeled  z.  To  avoid  confusion  when  there  is  a  conflict 
between  these  two  schemes,  the  g-factor  associated  with  the  principal  axis  that  is 
parallel to the symmetry z-axis is often labeled $g_\parallel$ and the g-value associated with the
other two axes is labeled $g_\perp$. Since the z-axis has a higher probability of being
oriented perpendicular to the field than parallel to it, the signal corresponding to $g_\perp$ is
more intense than that corresponding to $g_\parallel$. Taking the case where $g_\parallel > g_\perp$, this means
that the low field end of the spectrum is weak and the high field end is strong as
shown in Fig. 6.15 (second spectrum from top). When $g_\parallel < g_\perp$ the situation is
reversed (Fig. 6.15, third spectrum from top).
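
The intensity argument for the axial case can be written compactly. As a sketch, let $\theta$ be the angle between the symmetry axis and the magnetic field (a symbol introduced here for illustration). The effective g-value and the probability of finding the axis at that angle in a randomly oriented sample are then:

```latex
g_{\mathrm{eff}}(\theta) = \sqrt{g_\parallel^2 \cos^2\theta + g_\perp^2 \sin^2\theta},
\qquad
P(\theta)\,d\theta = \sin\theta\,d\theta, \quad 0 \le \theta \le \pi/2
```

Because $\sin\theta$ vanishes at $\theta = 0$ and is largest at $\theta = \pi/2$, few molecules contribute at $g_\parallel$ and many contribute at $g_\perp$.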

The  hyperfine  interaction  is  also  anisotropic  and  is  characterized  by  three  princi- 
pal values  Axx,  Ayy,  and  Azz.  The  powder  patterns  shown  in  Fig.  6.15  occur  when  the 
spectrum  is  dominated  by  the  Zeeman  interaction  and  the  hyperfine  interactions  (or 
any  other  interactions)  are  sufficiently  weak  that  they  contribute  only  to  the  line- 
width  at  a  given  orientation.  In  practice,  however,  resolved  hyperfine  splitting  is 
sometimes  observed.  In  such  cases,  the  principal  g-values  and  hyperfine  coupling 
constants  are  obtained  by  simulation  of  the  spectrum. 

For systems with $S > 1/2$, spin-orbit coupling and the dipolar coupling between
the  electrons  lead  to  zero-field  splitting.  The  term  zero-field  splitting  refers  to  the 
fact  that  the  splitting  of  the  energy  levels  caused  by  these  interactions  does  not 
depend  on  the  magnetic  field  and  can  be  observed  by  optical  methods  at  zero  field. 
In  systems  with  weak  spin-orbit  coupling,  the  zero-field  splitting  is  generally  much 
larger  than  the  anisotropy  in  the  g-tensor  but  much  smaller  than  the  Zeeman  energy 
and  the  EPR  spectrum  is  dominated  by  this  splitting.  An  example  for  a  molecular 
triplet state ($S = 1$) is shown in Fig. 6.16. In this case the triplet state gives a pair of
lines  for  each  orientation  and  the  resulting  powder  pattern  is  referred  to  as  a  Pake 
doublet.  From  the  positions  of  the  features  in  the  Pake  doublet,  the  zero-field  split- 
ting parameters  D  and  E  can  be  obtained. 
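
The relation between the features of the powder pattern and the parameters D and E can be sketched as follows. The example assumes the high-field limit, an isotropic g-value, D and E given in MHz, and |E| no larger than |D|/3; under these assumptions the three pairs of canonical features lie symmetrically about the center field. The labels "outer", "middle", and "inner" are used instead of axis names because the axis assignment depends on the sign convention, and the numerical values are hypothetical.

```python
H_OVER_BETA_E = 0.07144773  # mT per MHz

def triplet_turning_points(b0_mt, d_mhz, e_mhz, g=2.0023):
    """Field positions (mT) of the three pairs of canonical triplet features.

    First-order, high-field expressions: B0 +/- D', B0 +/- (D' + 3E')/2 and
    B0 +/- (D' - 3E')/2, with D' and E' converted to field units."""
    d = H_OVER_BETA_E * abs(d_mhz) / g
    e = H_OVER_BETA_E * abs(e_mhz) / g
    shifts = {
        "outer": d,
        "middle": (d + 3.0 * e) / 2.0,
        "inner": (d - 3.0 * e) / 2.0,
    }
    return {name: (b0_mt - s, b0_mt + s) for name, s in shifts.items()}

# Hypothetical triplet: D = 1000 MHz, E = 100 MHz, centered at 340 mT
points = triplet_turning_points(340.0, 1000.0, 100.0)
for name, (low, high) in points.items():
    print(f"{name:6s}: {low:7.2f} mT and {high:7.2f} mT")
```

Read in reverse, the separation of the outer pair gives 2|D| in field units, and the difference between the middle and inner separations gives 6|E|.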

Fig. 6.16 Pake doublet powder pattern arising from the zero-field splitting of a triplet state. (a)
Absorption spectrum. (b) Corresponding first derivative spectrum


6.5    Applications  in  Biophysics 

6.5.1    Spin-Orbit  Coupling  and  g-Value  Anisotropy

In  the  paramagnetic  centers  found  in  biological  systems,  the  electrons  occupy 
atomic  and  molecular  orbitals,  and  as  a  result,  the  total  angular  momentum  consists 
of  the  inherent  spin  angular  momentum  plus  the  orbital  angular  momentum.  The 
latter  is  normally  zero  for  molecules  in  the  ground  state  (it  is  said  to  be  "quenched"), 
but  mixing  of  the  ground  and  excited  states  through  spin-orbit  coupling  allows  the 
ground  state  to  acquire  some  orbital  angular  momentum.  The  amount  of  spin-orbit 
coupling depends on the size of the atom; it is small for oxygen, nitrogen, and carbon,
and is large for metals. As a result of spin-orbit coupling, $g = g_e + \Delta g$, where $g_e$
is the free-electron g-factor (2.0023193) and $\Delta g$ is a correction that depends
on the degree of spin-orbit coupling. Hence, the energy to induce a transition
between the Zeeman components is altered. In organic free radicals, $\Delta g$ is small,
and the spectra are similar to that of a free electron, whereas in metals, $\Delta g$ can be
large and $g$ can deviate significantly from $g_e$. Because orbitals in molecules do not
have  spherical  symmetry,  the  magnitude  of  the  spin-orbit  coupling  is  direction- 
dependent,  i.e.,  the  admixture  is  anisotropic.  Hence,  the  contribution  to  the  mag- 
netic moment  from  orbital  motion  depends  on  the  orientation  in  an  external  field.  It 
is  this  orientation  dependence  of  the  magnetic  moment  that  leads  to  the  g-anisotropy 
and the rotation patterns and powder spectra shown in Figs. 6.14 and 6.15.




Fig. 6.17 EPR spectra of three organic cofactor radicals in Photosystem I. Dashed line: $A_0^{-}$,
radical anion of the primary electron acceptor; dotted line: $P_{700}^{+}$, radical cation of the
chlorophyll special pair donor; solid line: $A_1^{-}$, phylloquinone radical anion. The four prominent
hyperfine couplings from the ring methyl group are indicated in the 34 GHz spectrum. The
relevant g-value scale is depicted above each spectrum

6.5.1.1    Organic  Cofactor  Radicals

Spin-orbit  coupling  in  organic  radicals  (centered  on  C,  O,  S,  and  N)  is  typically 
weak,  hence  the  g-factor  anisotropy  is  small  and  the  resolution  at  X-band  is  often 
insufficient  to  obtain  the  components  of  the  g-tensor.  For  some  oxygen-containing 
species  such  as  tyrosine  radical  anions  or  semiquinones,  partial  resolution  of  the 
g-tensor  components  can  be  achieved  at  Q-band.  However,  in  general  high  field 
EPR  (W-band  or  higher)  is  required  to  resolve  the  g-anisotropy  in  organic  radicals. 
Figure 6.17 depicts three organic radicals: $P_{700}^{+}$ (a chlorophyll cation radical), $A_0^{-}$ (a
chlorophyll anion radical), and $A_1^{-}$ (a phylloquinone anion radical). At X-band, the
spectrum of $A_0^{-}$ appears as a single derivative lineshape with a g-value of 2.003 and
a linewidth of 0.12 mT. $P_{700}^{+}$ is a special pair of chlorophyll molecules, and because
the spin is distributed over not one but two macrocycles, the linewidth is correspondingly
narrower (ideally by a factor of $\sqrt{2}$ for an equal distribution of the
spin). It is characterized by a derivative-shaped signal with a g-value of 2.002 at the
crossover point and a linewidth of 0.09 mT. The g-tensor anisotropy in $A_1^{-}$ is larger
than that of $A_0^{-}$ due to the presence of electron density at the electronegative oxygen
atom. In quinone radicals the g-anisotropy is primarily a result of mixing of the
lone-pair orbitals of the oxygen with $\pi^*$ orbitals by spin-orbit coupling. The larger
g-anisotropy leads to an X-band spectrum for phylloquinone that is characterized by
a distorted derivative signal with a g-value of 2.0050 at the crossover point and a
linewidth of 0.14 mT (Fig. 6.17, top). At Q-band, the resolution is enhanced by a
factor of 3.8 and although this is insufficient to resolve the g-tensor components of
the chlorophyll anion and $P_{700}^{+}$ radicals, it does allow a more precise measurement
of their average g-factors and reveals a slight difference: 2.0035 for $A_0^{-}$ and
2.0028 for $P_{700}^{+}$. Both signals still appear isotropic due to the large number of
overlapping, unresolved hyperfine couplings from the ring protons (see below). The
$A_1^{-}$ phylloquinone radical, however, shows a more complicated lineshape that
incorporates a contribution from a partially resolved set of four hyperfine components
from the three equivalent protons of the ring -CH$_3$ group (Fig. 6.17, bottom).
The hyperfine splitting can be eliminated by substituting the hydrogen atoms with
deuterium, which has a spin 1 nucleus and a relatively small magnetic moment
because of its higher mass. In deuterated samples, a near-rhombic spectrum with
$g_{xx} = 2.0062$, $g_{yy} = 2.0050$, and $g_{zz} = 2.0021$ is obtained for the $A_1^{-}$ radical (Fig. 6.18).
These  examples  additionally  show  that  spin-orbit  coupling  leads  to  a  downfield  shift 
of  gxx  and  gyy,  but  not  gzz,  as  the  magnetic  field  strength  increases. 
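
The gain in g-resolution at higher microwave frequency follows directly from the resonance condition. The sketch below computes the field separation between the largest and smallest principal g-values quoted above for the phylloquinone radical at three microwave frequencies. The frequencies of 9 and 34 GHz are taken from the text, while 94 GHz for W-band is an assumed typical value; hyperfine broadening is ignored.

```python
H_OVER_BETA_E = 0.07144773  # mT per MHz

def field_mt(g, nu_ghz):
    """Resonance field (mT) for a g-value at a microwave frequency in GHz."""
    return H_OVER_BETA_E * nu_ghz * 1.0e3 / g

def g_spread_mt(g_max, g_min, nu_ghz):
    """Field separation (mT) between the features at g_min and g_max."""
    return field_mt(g_min, nu_ghz) - field_mt(g_max, nu_ghz)

G_XX, G_ZZ = 2.0062, 2.0021  # principal values quoted in the text
for band, nu in (("X", 9.0), ("Q", 34.0), ("W", 94.0)):
    print(f"{band}-band ({nu:4.0f} GHz): spread = {g_spread_mt(G_XX, G_ZZ, nu):5.2f} mT")
```

The spread scales linearly with frequency, whereas field-independent contributions to the linewidth such as unresolved hyperfine couplings do not, which is why the tensor components separate at high field.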

6.5.1.2    Metal  Centers  in  Iron  Proteins 

Compared  to  organic  radicals,  the  spin-orbit  coupling  in  transition  metal  centers  is 
large  due  to  the  presence  of  partially  filled  d-orbitals;  hence,  the  g-factor  anisotropy 
is  typically  spread  over  a  large  spectral  range.  A  microwave  frequency  of  9  GHz 
(X-band)  is  usually  sufficient  to  extract  the  principal  values  of  the  g-tensor.  The 
EPR  spectrum  of  the  reduced  [4Fe-4S]  FA  cluster  in  Photosystem  I  (PS  I),  for  exam- 
ple, shows  a  well-resolved  g-tensor  with  principal  g-values  of  2.07,  1.93,  and  1.89 
(Fig.  6.19). 


6.5.2    Hyperfine  Coupling  and  Spin-Density  Distributions 

Hyperfine  couplings  are  particularly  useful  for  studying  the  interaction  of  organic 
cofactors  with  the  surrounding  protein.  From  such  couplings  it  is  possible  to  detect 
the  presence  of  hydrogen  bonds  or  deduce  whether  the  unpaired  electron  is  local- 
ized on  the  cofactor  or  delocalized  onto  the  binding  site  or  over  several  cofactors. 
Such  conclusions  can  be  very  important  for  understanding  the  function  of  enzymes, 
particularly  those  that  perform  electron  transfer  reactions.  The  hyperfine  coupling  is 
composed of two parts, an isotropic Fermi contact term and an anisotropic
dipole-dipole interaction term. In isotropic solution the latter term averages to zero
and only the contribution from the orientation-independent Fermi contact term is
observed. However, for most biological samples experiments are carried out in the
rigid limit regime and both terms contribute. The magnitude of the Fermi contact
term ultimately depends on the unpaired electron density from the 1s orbital at the
nucleus. The influence of valence electrons occurs through a mechanism known as
core polarization.

Fig. 6.18 Q-band EPR spectrum of per-deuterated phylloquinone radical anion (top)
compared with natural abundance phylloquinone radical anion (bottom). Simulated
spectra are shown as dotted lines. The principal g-values are depicted

The primary donors and quinone acceptors in photosynthetic reaction centers are
two good examples of how hyperfine couplings can be used to deduce details of the
local environment of organic cofactors. In reaction centers of purple bacteria, the
chlorophylls are arranged in a cofacial dimer and the hyperfine couplings have been
determined in single crystals. The couplings cannot be resolved in the EPR spectrum
and ENDOR is needed to obtain them. ENDOR spectra from single crystal
measurements give rotation patterns similar to those shown in Fig. 6.14b from
which the principal values and principal axis orientations of the hyperfine coupling
tensors can be obtained. The rotation patterns of the primary donor cation in
Rhodobacter sphaeroides R-26 have been measured [25] and the couplings and axis
orientations allow the ENDOR signals to be assigned to individual nuclei in each of
the two chlorophyll molecules. From the magnitudes of the couplings, a spin density
map of the system is obtained that shows higher spin density on one half of the
dimer. The asymmetry of the spin density indicates that although the dimer shows a
high degree of symmetry in the positions of the atoms, electronically the two
chlorophylls are inequivalent.

Fig. 6.19 Low temperature X-band spectrum of the [4Fe-4S]2-FA cluster in
Photosystem I. The resonance at 335 mT is due to P700+. The principal g-values are
depicted

As shown in Fig. 6.17, partially resolved hyperfine splitting is observed in the
Q-band spectrum of the reduced phylloquinone acceptor in PS I. The coupling is a
direct result of the hydrogen bonding of the quinone to its binding site. Several studies
have shown that the splitting is due to the protons of the 2-methyl group [26, 27]
and that in the phylloquinone binding site in PS I, the splitting is about 20% larger
than in isopropanol solution [28]. This difference arises because in protic
solvents both oxygen atoms of phylloquinone are hydrogen-bonded. In contrast, in
the phylloquinone binding site in PS I, only one of the two oxygens is H-bonded to
the protein. In the presence of asymmetric H-bonding, the spin density on the quinone
headgroup is distorted as shown in Fig. 6.20. If the oxygen at the 4-position of
the ring is more strongly H-bonded than the oxygen in the 1-position, e.g., if there
is only a single H-bond (Fig. 6.20, right), the distortion increases the spin density
next to the methyl group in the 2-position. Thus, the strength of the methyl hyperfine
coupling can be used to probe the nature of the hydrogen bonding.

Fig. 6.20 Spin density distribution in phylloquinone with different H-bonding
configurations. Left: Symmetric H-bonding, low spin density adjacent to the
2-methyl group. Right: Asymmetric H-bonding, high spin density adjacent to the
2-methyl group

6.5.3    Electron-Electron  Spin-Spin  Interactions

6.5.3.1    Site-Directed  Spin  Labeling 

The dipole-dipole coupling between magnetic moments is dependent on the distance
between the magnetic moments and is widely used to obtain structural information.
In NMR the Nuclear Overhauser Effect, which is the transfer of polarization
between nuclei via the dipolar interaction, is routinely used to place constraints on
the structure of the protein. With nuclear spins, distances up to about 0.5 nm can be
estimated. If electrons are measured, the much larger magnetic moment means that
it is possible to determine distances up to several nm. The main drawback, however,
is the lack of naturally occurring unpaired electrons. A solution to this problem has
been found by using the reactivity of the cysteine sulfhydryl group towards
methanethiosulfonates to form disulfide bonds. If a stable nitroxide radical is substituted
with a methanethiosulfonate side chain, it can be attached to cysteine-containing proteins so
that they are spin labeled at specific sites. Using point mutagenesis techniques, it is
possible to introduce cysteine residues at specific locations and thus the protein can
be site-specifically spin-labeled. If a variant of the protein is constructed with only
two cysteines, two spin labels can be introduced and the distance between them can
then be determined by measuring the dipolar coupling between the unpaired electrons.
The spin-spin coupling is not easily determined from the CW EPR spectra of
spin-labeled proteins but can be obtained most accurately from the modulation
curves in DEER or PELDOR measurements. If the distance between the two
unpaired electrons is large compared to the dimensions of the distribution of the
electrons, the point-dipole approximation can be used and the dipolar coupling is
related to the distance by:

$$D = -\frac{2.78}{r^{3}}\ \mathrm{mT\,nm^{3}} \qquad (6.12)$$
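
For two spins with g close to 2.0023 in the point-dipole limit, the coupling in magnetic field units is D = -(3/2)(mu_0/4pi) g mu_B / r^3. The following sketch evaluates that prefactor from physical constants and tabulates D for a few distances. The function names and the sample distances are illustrative assumptions, and the same isotropic g-value is assumed for both spins.

```python
MU_0_OVER_4PI = 1.0e-7            # T m / A (accurate to better than 1 part in 1e9)
BOHR_MAGNETON = 9.2740100783e-24  # J/T (CODATA 2018)
G_FREE_ELECTRON = 2.0023


def dipolar_prefactor_mT_nm3(g_value=G_FREE_ELECTRON):
    """Return (3/2)(mu_0/4pi) g mu_B expressed in mT nm^3."""
    prefactor_tesla_m3 = 1.5 * MU_0_OVER_4PI * g_value * BOHR_MAGNETON
    return prefactor_tesla_m3 * 1.0e3 * 1.0e27  # T -> mT and m^3 -> nm^3


def dipolar_coupling_mT(distance_nm, g_value=G_FREE_ELECTRON):
    """Point-dipole coupling D (in mT) for two spins separated by distance_nm."""
    return -dipolar_prefactor_mT_nm3(g_value) / distance_nm ** 3


print(f"prefactor = {dipolar_prefactor_mT_nm3():.3f} mT nm^3")
for r in (2.0, 3.0, 5.0):  # illustrative distances in nm
    print(f"r = {r:.1f} nm -> D = {dipolar_coupling_mT(r):.4f} mT")
```

Because D falls off as 1/r^3, doubling the distance reduces the coupling eightfold, which sets the practical upper limit of several nm for this kind of measurement.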

Recent examples of the use of this technique include studies of the ATP hydrolysis
cycle in ABC transporters [29] and of the self-association of the histidine kinase
CheA [30].

6.5.3.2    Light-Induced  Radical  Pairs

Weakly coupled two-electron spin systems are also found in proteins that undergo
light-induced electron transfer. The two main classes of such proteins are the
photosynthetic reaction centers and the photolyase and cryptochrome flavin proteins.
In both of these systems one or more chains of electron transfer cofactors extend
away from a chromophore (a chlorophyll dimer or a flavin) that acts as either an
electron donor or an electron acceptor in its excited state. When the chromophore
absorbs light, electron transfer generates a series of radical-ion pairs. By passing
electrons rapidly along the chain of cofactors, the charge separation is stabilized.
The first of the radical pairs that can be detected by transient EPR is weakly coupled,
that is, the spin-spin coupling is smaller than the difference in the resonance
frequencies of the two radicals. In the excited state of the chromophore the two
electrons are strongly correlated because they reside on the same molecule. The fact
that the radical pairs are generated from this highly correlated state on a timescale
that is short compared to the precession and relaxation of the spins means that they
have an unusual population distribution and dynamic properties. Figure 6.21a shows
the energy level diagram and the EPR transitions in a weakly coupled radical pair.
Because the state of the chromophore from which the radical pair is generated is
either a singlet state or a triplet state, it is useful to express the states of the radical
pair in a singlet-triplet basis. As can be seen in Fig. 6.21a, only the two middle
levels have singlet character. Thus, if the electron transfer is initiated from the
excited singlet state of the chromophore, only the states 2 and 3 of the radical pair
are populated because they are the only ones with any singlet character. This selective
population of the radical pair spin states leads to both absorptive and emissive
transitions in the direct detection mode spectra as shown in Fig. 6.21b. Such a spectrum
is said to be spin polarized and the pattern of absorptive and emissive features
is called a polarization pattern. The polarization patterns of weakly coupled radical
pairs depend on the internal geometry of the pair. In particular they are sensitive to
the orientation of the g-tensors of the two radicals relative to the distance vector
between them [31]. This orientation dependence has been widely used to study the
quinone acceptors in photosynthetic reaction centers of purple bacteria and PS I
[32]. A comparison of the W-band spectra of the radical pair P+Q- in PS I and purple
bacterial reaction centers is shown in Fig. 6.22. Here P stands for the respective
primary donors P700 and P865 and Q for the quinone acceptors phylloquinone and
ubiquinone, respectively. The differences on the low field end of the spectra are
primarily due to the different orientations of the quinones.
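
The statement that only the two middle levels carry singlet character can be illustrated with a deliberately simplified model. The sketch below assumes a purely isotropic coupling of the form a*S1.S2 between two spins 1/2 with resonance frequencies differing by delta_omega; the real radical pair also has an orientation-dependent dipolar coupling, so the function name, the model, and the numbers are illustrative assumptions rather than a description of any particular reaction center.

```python
import math


def singlet_character(delta_omega, coupling):
    """Singlet weight of the four radical-pair eigenstates, listed as states 1 to 4.

    delta_omega: difference of the two resonance frequencies (any consistent unit)
    coupling:    isotropic spin-spin coupling a in a*S1.S2 (same unit)
    """
    # Only |alpha beta> and |beta alpha> are mixed by the coupling; the mixing
    # angle phi of that 2x2 block obeys tan(2*phi) = coupling / delta_omega.
    sin_two_phi = math.sin(math.atan2(coupling, delta_omega))
    state_2 = 0.5 * (1.0 - sin_two_phi)  # upper of the two mixed states
    state_3 = 0.5 * (1.0 + sin_two_phi)  # lower of the two mixed states
    # States 1 and 4 are |alpha alpha> and |beta beta>, which are pure triplet.
    return [0.0, state_2, state_3, 0.0]


# Weak coupling: the coupling is much smaller than the frequency difference.
weights = singlet_character(delta_omega=50.0, coupling=1.0)
for number, weight in enumerate(weights, start=1):
    print(f"state {number}: singlet character = {weight:.3f}")
```

In the weak-coupling limit each of the two middle states is close to half singlet, so a singlet-born pair populates them almost equally and leaves the outer states empty, which is the origin of the mixed absorptive and emissive pattern.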

Fig. 6.21 Energy level diagram and spectra of a weakly coupled radical pair formed from a singlet
precursor. (a) Energy level diagram (b) W-band spectra

Fig. 6.22 W-band spectra of Photosystem I

Fig. 6.23 The out-of-phase echo modulation experiment for light-induced singlet-born
radical pairs. Axis and curve labels: Time; Mx, My; d = 0.12 mT, d = 0.17 mT; t / ns (0 to 2000)

The selective population of the spin states in light-induced radical pairs also has
a strong influence on their spin echoes. As shown in the single orientation EPR
spectrum (Fig. 6.21b, left), each radical contributes a so-called antiphase doublet
split by the spin-spin coupling. Overall there is no net magnetization of the sample.
Under these conditions when the Hahn echo sequence is applied, the spin echo is
phase shifted by 90° so that if the pulses are applied along the x-direction the echo
also appears in the x-direction rather than in the y-direction. As illustrated in
Fig. 6.23, a dispersion signal is obtained in the in-phase channel (My) while the echo
absorption signal is observed in the out-of-phase channel (Mx). When the spacing
between the two pulses in the echo sequence is varied, the echo amplitude shows
deep modulations that depend strongly on the spin-spin coupling. This dependence
is illustrated in the bottom part of Fig. 6.23, which shows the echo height plotted as
a function of the pulse spacing for values of the dipolar coupling of 0.17 and
0.12 mT. As can be seen, this small change in the coupling, which corresponds to a
change of 0.3 nm in the distance, has a dramatic effect on the modulation curve.
Thus, the out-of-phase echo modulation allows distances between the cofactors in
light-induced radical pairs to be determined to an accuracy that is usually better than
that which can be obtained by X-ray crystallography. A drawback, however, is that
the distance obtained is between the centers of the spin density, and the positions of
these points relative to the positions of the atoms are not always unambiguous.
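
The distance change quoted for the two coupling values can be checked with a short calculation. The sketch below inverts the point-dipole relation D = -C/r^3, taking C = 2.785 mT nm^3, which is the value of (3/2)(mu_0/4pi) g mu_B for g = 2.0023 and is an assumption here; the function name is hypothetical.

```python
POINT_DIPOLE_CONSTANT = 2.785  # mT nm^3, assumes g = 2.0023 for both spins


def distance_nm(coupling_mT):
    """Invert D = -C / r^3 for the distance r (nm); only |D| matters."""
    return (POINT_DIPOLE_CONSTANT / abs(coupling_mT)) ** (1.0 / 3.0)


r_strong = distance_nm(0.17)
r_weak = distance_nm(0.12)
print(f"D = 0.17 mT -> r = {r_strong:.2f} nm")
print(f"D = 0.12 mT -> r = {r_weak:.2f} nm")
print(f"difference   = {r_weak - r_strong:.2f} nm")
```

Under these assumptions the two couplings correspond to distances of about 2.5 and 2.9 nm, consistent with the change of roughly 0.3 nm stated in the text.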


6.5.3.3    Molecular  Triplet  States 

In  many  photoactive  proteins  triplet  states  of  chromophores  are  formed  that  can  also 
be  detected  by  time-resolved  EPR  methods.  In  the  excited  states  of  molecules,  the 
exchange  coupling  between  the  electrons  is  generally  orders  of  magnitude  larger 
than  the  Zeeman  energy  but  the  dipolar  coupling  is  typically  an  order  of  magnitude 
or  more  smaller  than  the  Zeeman  energy  depending  on  the  magnetic  field.  This 
results  in  EPR  spectra  that  are  dominated  by  the  zero-field  splitting  as  discussed 
above  with  Fig.  6.16. 

Fig. 6.24 Spin-polarized chlorophyll triplet state spectra. Left: Calculated spectra for different
possible pathways to the triplet state. Right: Experimental spectrum from Photosystem II D1/D2
particles at 40 K

For the triplet state of a chromophore to be populated, a mechanism by which a
spin-flip can occur must be present. There are two such mechanisms that commonly
operate. The first of these is spin-orbit coupling, in which a change in the spin
angular momentum is accompanied by a change in the orbital angular momentum
such that the total angular momentum remains unchanged. The second is singlet-triplet
mixing that can occur in radical pairs due to the different precession frequencies
of the spins of the two radicals. Because of this mixing, radical pairs formed
initially in the singlet state often recombine to a molecular triplet state. In both
cases, the process is spin-selective and hence the triplet state is spin-polarized.
However, the spin selectivity is different in the two cases and the polarization patterns
obtained are not the same. In the case of intersystem crossing, the strength of
the spin-orbit coupling is orientation-dependent and is governed by the symmetry of
the molecule. Singlet-triplet mixing in a radical pair, on the other hand, depends on
the singlet character of the radical pair spin states. For a weakly coupled radical pair
at high magnetic field, the singlet state mixes only with the T0 state. Thus, charge
recombination occurs exclusively to the T0 level of the molecular triplet state and
the polarization does not depend on orientation. Figure 6.24, left shows the polarization
patterns expected for charge recombination and spin-orbit coupling mediated
intersystem crossing. In the latter case it is assumed that spin-orbit coupling is
strongest along the molecular z-direction and negligible in the x- and y-directions.
As can be seen, the patterns differ significantly from one another so that it is possible
to deduce the pathway by which the triplet state was generated. The right side of
Fig. 6.24 shows the experimental transient EPR spectrum obtained from
Photosystem II particles that lack the terminal quinone acceptors. In these particles,
charge separation generates the primary radical pair, which then recombines on a
nanosecond timescale. From the transient EPR spectrum it is readily apparent that
the triplet state of the primary donor P680 is populated via the recombination.
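
The consequence of populating only the T0 level can be worked out for the three canonical orientations. The sketch below uses the high-field, first-order energies of a triplet with zero-field splitting parameters D and E and an isotropic g-value: for the field along principal axis i, T0 lies at X_i and the T+1 and T-1 levels lie at +/- g*mu_B*B - X_i/2, where X = D/3 - E, Y = D/3 + E, and Z = -2D/3. The function name, the center field, and the values D = 30 mT and E = 4 mT are illustrative assumptions, not parameters taken from Fig. 6.24.

```python
def recombination_triplet_pattern(d_mT, e_mT, center_field_mT=340.0):
    """Return (field, sign, axis) for the six canonical turning points.

    Only T0 is populated, so the T0 -> T+1 transition is absorptive ("A")
    and the T-1 -> T0 transition is emissive ("E").
    """
    zero_field_energies = {
        "x": d_mT / 3.0 - e_mT,
        "y": d_mT / 3.0 + e_mT,
        "z": -2.0 * d_mT / 3.0,
    }
    lines = []
    for axis, energy in zero_field_energies.items():
        shift = 1.5 * energy
        lines.append((center_field_mT + shift, "A", axis))  # T0 -> T+1
        lines.append((center_field_mT - shift, "E", axis))  # T-1 -> T0
    return sorted(lines)


pattern = recombination_triplet_pattern(d_mT=30.0, e_mT=4.0)
for field, sign, axis in pattern:
    print(f"{field:6.1f} mT  {sign}  (B parallel to {axis})")
print("".join(sign for _, sign, _ in pattern))
```

For D > 0 this gives the sequence AEEAAE from low to high field. Reversing the sign of D inverts every sign, so the pattern also carries information about the sign of D.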

Fig. 6.25 Energy level diagram for a spin 3/2 system with zero-field splitting
parameters D = +0.5 cm-1 and E = 0. The applied magnetic field is parallel to the z-axis
of the zero-field splitting tensor. The double-headed arrows indicate the field
positions at which X-band EPR transitions occur

6.5.3.4    High  Spin  (S  =  3/2, 5/2)  Metal  Center  Systems

Metalloproteins with S = 1/2 are relatively straightforward to analyze by EPR
because the spectra are only a function of the interaction of a single electron with
the external magnetic field (Zeeman splitting) and with the field of nearby magnetic
nuclei (hyperfine interactions). However, when S > 1/2, the dipolar interaction
between two or more unpaired electrons also contributes and spin-orbit coupling
causes shifts of the spin states with different values of |mS| relative to one another.
High spin systems are found in heme proteins, which can exist with S = 5/2 as well
as S = 1/2, and in simple [4Fe-4S] clusters, which can exist with S = 3/2 as well as
S = 1/2. The relevant parameters in a high spin system are D, the axial zero-field
splitting parameter, and E, the rhombic splitting parameter. The ratio of E to D is
termed the "rhombicity" and it varies from E/D = 0 in an exclusively axial system to
E/D = 1/3 in an exclusively rhombic system. In metalloproteins, the value of D is
often much larger than the Zeeman energy at X-band (i.e., it is larger than ~0.3 cm-1),
so that the separation of states with different values of |mS| is large. The pairs of
states with mS = ±1/2, ±3/2, etc., can be treated as pseudo S = 1/2 systems termed
Kramers doublets (Fig. 6.25). In general, an S = n/2 multiplet forms (n+1)/2
Kramers doublets (i.e., S = 3/2 forms two Kramers doublets and S = 5/2 forms three
Kramers doublets). At zero field, the microwaves are not able to cause transitions
between the doublets due to the large splitting, but as a magnetic field is applied,
each doublet gives rise to its own set of resonances as if it were an S = 1/2 system.
One important consideration is that if the zero-field splitting is larger than kT, some
of the Kramers doublets may have low population according to the Boltzmann
distribution. However, in heme and iron-sulfur proteins, D is sufficiently small that
even at 4 K, all of the doublets will be populated. In the case of an S = 3/2 system
(Fig. 6.25), two sets of transitions can be observed, one from the ±1/2 Kramers
doublet and the other from the ±3/2 Kramers doublet. If D is positive, the ±1/2
Kramers doublet is lower in energy and is termed the ground doublet and the ±3/2
doublet is higher and is called the excited state. For D < 0, the situation is reversed.
In both cases the lower Kramers doublet will be preferentially populated.

Fig. 6.26 Rhombograms for S = 3/2 and S = 5/2 species showing the effective g-values of the
powder spectrum features. The g-tensors of the species have been assumed to be isotropic and
equal to the free electron value. The zero-field splitting is assumed to be large compared to the
Zeeman energy
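
How strongly the lower doublet is favored can be estimated from the Boltzmann distribution. For S = 3/2 the two doublets are separated at zero field by 2*sqrt(D^2 + 3E^2), and each doublet is twofold degenerate. The sketch below evaluates the populations for D = +0.5 cm-1 and E = 0, the parameters used in Fig. 6.25; the function name and the choice of temperatures are illustrative assumptions, and the small Zeeman splitting within each doublet is neglected.

```python
import math

BOLTZMANN_CM = 0.6950348  # Boltzmann constant in cm^-1 per kelvin


def doublet_populations_s32(d_cm, e_cm, temperature_k):
    """Fractional populations (lower, upper) of the two S = 3/2 doublets."""
    gap_cm = 2.0 * math.sqrt(d_cm ** 2 + 3.0 * e_cm ** 2)
    boltzmann_factor = math.exp(-gap_cm / (BOLTZMANN_CM * temperature_k))
    lower = 1.0 / (1.0 + boltzmann_factor)
    return lower, 1.0 - lower


for temperature in (4.2, 10.0, 30.0):  # illustrative temperatures in K
    lower, upper = doublet_populations_s32(0.5, 0.0, temperature)
    print(f"T = {temperature:5.1f} K: lower = {lower:.2f}, upper = {upper:.2f}")
```

With these parameters the upper doublet still holds roughly 40% of the population at 4.2 K, which is why resonances from both doublets can usually be observed at liquid helium temperatures.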

Fig. 6.27 EPR spectrum of the S = 3/2 ground state [4Fe-4S] cluster Fx in the photosynthetic
reaction center of Heliobacterium modesticaldum as a function of temperature. The principal
g-values are depicted

These concepts can be generalized for systems with E/D > 0 through a visual
formalism termed a "rhombogram," which depicts the g-values of the features in the
EPR powder spectra for the ground and excited state doublets as a function of the
ratio of E/D. In an S = 3/2 system (Fig. 6.26, left), two Kramers doublets will be
present and for each of these, three features are observed corresponding to the three
principal axes of the zero-field splitting tensor. For an axial system with E/D = 0, the
features from the x and y orientations overlap in the ground doublet, while in the
excited doublet they occur at infinite field (g = 0) so only three features are observed
at g = 2, 4, and 6. For rhombic systems with E/D > 0, six features will occur, three in
the ground state doublet and three in the excited state doublet. The resonances
associated with the ground state and excited state doublets can be distinguished by
temperature studies. The ground state is preferentially populated relative to the
excited state at lower temperatures, and the relative population of the excited state
increases as the temperature is raised.
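
The curves of an S = 3/2 rhombogram can be generated from closed-form expressions under the same assumptions as Fig. 6.26: an isotropic real g-value and a zero-field splitting much larger than the Zeeman energy. The sketch below is a minimal version of that calculation; the function name is hypothetical, g is rounded to 2.0 for readability, and the doublets are labeled by the mS value they reduce to in the axial limit.

```python
import math


def s32_effective_g(rhombicity, g_real=2.0):
    """Effective (gx, gy, gz) of the two Kramers doublets of an S = 3/2 system.

    rhombicity is E/D, between 0 (axial) and 1/3 (fully rhombic).
    Returns (g of the +/-1/2-like doublet, g of the +/-3/2-like doublet).
    """
    root = math.sqrt(1.0 + 3.0 * rhombicity ** 2)
    minus = (1.0 - 3.0 * rhombicity) / root
    plus = (1.0 + 3.0 * rhombicity) / root
    half_like = (
        g_real * (1.0 + minus),
        g_real * (1.0 + plus),
        g_real * (2.0 / root - 1.0),
    )
    three_half_like = (
        g_real * abs(1.0 - minus),
        g_real * abs(1.0 - plus),
        g_real * (2.0 / root + 1.0),
    )
    return half_like, three_half_like


for eta in (0.0, 1.0 / 3.0):
    half_like, three_half_like = s32_effective_g(eta)
    print(f"E/D = {eta:.3f}")
    print("  +/-1/2-like doublet:", [round(value, 2) for value in half_like])
    print("  +/-3/2-like doublet:", [round(value, 2) for value in three_half_like])
```

At E/D = 0 the sketch reproduces the features at g = 4, 4, 2 for one doublet and g = 0, 0, 6 for the other. At E/D = 1/3 both doublets share the same set of three g-values, which is why the rhombogram curves meet at the rhombic limit.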

In practice, if the ground state resonances can be identified, the E/D value can be
determined and the excited state resonances will be specified by the rhombogram,
and conversely, if the excited state resonances can be identified, the E/D value can
be similarly determined, and the ground state resonances will be specified by the
rhombogram. Such S = 3/2 systems are typically present in [4Fe-4S] clusters in
which one cysteine ligand has been replaced either by an oxygen ligand or an external
thiolate, and in [4Fe-4S] clusters that are shared between a protein homodimer.
Figure 6.27 shows the EPR spectrum of the S = 3/2 system present in the interpolypeptide
Fx cluster of Heliobacterium modesticaldum. At 4.2 K, the spectrum shows
two distinctive features: a peak at g = 5.4 and a shoulder at g = 4.4. As the temperature
is raised, the g = 5.4 peak decreases in intensity while the g = 4.4 shoulder
increases in intensity. The temperature dependence suggests that the former signal
is associated with the ground Kramers doublet, whereas the latter feature is associated
with the excited Kramers doublet. The g-values of approximately 5.4 and 4.4
are roughly those expected for an S = 3/2 spin system exhibiting a rhombicity, E/D, of
approximately 0.2 (Fig. 6.26).

Fig. 6.28 X-band EPR spectrum of horse myoglobin at pH 6.0. The principal
g-values are depicted

In an S = 5/2 system three Kramers doublets will be present. The rhombogram
(Fig. 6.26, right) depicts the resonances expected as the E/D value is varied from an
exclusively axial system to an exclusively rhombic system. Such S = 5/2 systems are
present in heme proteins such as horse myoglobin at pH 6.0, which is an almost purely
axial system with an E/D value close to 0 (Fig. 6.28). According to the rhombogram,
the ground state S = 1/2 Kramers doublet, which will be preferentially populated at
low temperatures, will show two g-values around 2 and 6. The middle Kramers
doublet will also show the g = 6 resonance, but the g = 0 resonances will not be
observed. The highest Kramers doublet will be only lightly populated at low
temperature, so its g = 10 resonance is also unlikely to be observed. Hence,
high spin heme proteins typically show a strong resonance around g = 6 with only a
minor feature around g = 2.
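
The g-values quoted for the axial limit follow from a first-order treatment. When E/D = 0 and the zero-field splitting is much larger than the Zeeman energy, each doublet with quantum number |mS| has a parallel effective g-value of 2|mS|g, while the perpendicular value is (S + 1/2)g for the ±1/2 doublet and zero for all others. The sketch below tabulates these values for any half-integer spin; the function name is hypothetical, and g is taken as 2.0 as an assumption.

```python
from fractions import Fraction


def axial_effective_g(twice_spin, g_real=2.0):
    """First-order effective (g_parallel, g_perpendicular) per Kramers doublet.

    twice_spin is 2S and must be odd (3 for S = 3/2, 5 for S = 5/2).
    Keys of the returned dict are |mS| written as fractions, e.g. "1/2".
    """
    if twice_spin % 2 == 0:
        raise ValueError("Kramers doublets require a half-integer spin")
    spin = Fraction(twice_spin, 2)
    doublets = {}
    m_value = Fraction(1, 2)
    while m_value <= spin:
        g_parallel = 2.0 * float(m_value) * g_real
        if m_value == Fraction(1, 2):
            g_perpendicular = float(spin + Fraction(1, 2)) * g_real
        else:
            g_perpendicular = 0.0
        doublets[str(m_value)] = (g_parallel, g_perpendicular)
        m_value += 1
    return doublets


for label, (g_par, g_perp) in axial_effective_g(5).items():
    print(f"|mS| = {label}: g_parallel = {g_par:.0f}, g_perpendicular = {g_perp:.0f}")
```

For S = 5/2 this gives (2, 6), (6, 0), and (10, 0) for the three doublets, and the number of entries equals (n+1)/2 for S = n/2. Features at g = 0 correspond to infinite field and are never observed.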

Another interesting example of an S = 5/2 system is octahedrally coordinated ferric iron
(commonly known as "junk iron"), which is present in most biological samples. It
is a rhombic system with an E/D value of 1/3. According to the rhombogram, the
ground state S = 1/2 Kramers doublet, which is preferentially populated at low
temperatures, shows two g-values less than 1 and one g-value greater than 9, but the
former are unlikely to be observed. The same consideration holds for the highest
excited state Kramers doublet, with the added provision that it will be less populated
than the ground state doublet. In the middle Kramers doublet, all three g-values
converge to 4.29 at the rhombic limit; hence, this is the only prominent
resonance observed from aqueous Fe3+ ions. These resonances are therefore relatively
intense, making it easy to overestimate the amount of adventitiously bound
iron present in a biological sample.


6.6    Concluding  Remarks 

Our  goal  in  this  overview  of  biological  EPR  is  to  give  an  introduction  to  the  method 
without  delving  into  the  complicated  mathematics  that  is  needed  for  a  complete 
description. In the interest of clarity and because of space constraints, we have not
been able to touch on many elegant and important aspects of EPR spectroscopy.
However, we hope that we have been able to encourage readers and students new to
the field of EPR to further explore the rich literature of the technique.

Chapter  7

Mass  Spectrometry 

Igor  A.  Kaltashov  and  Cedric  E.  Bobst 


Abstract  Mass  spectrometry  is  now  an  indispensable  tool  in  the  armamentarium  of 
molecular  biophysics,  where  it  is  used  for  tasks  ranging  from  covalent  structure 
determination  to  studies  of  higher  order  structure,  conformational  dynamics,  and 
interactions  of  proteins  and  other  biopolymers.  This  chapter  considers  the  basics  of 
biological mass spectrometry and highlights recent advances in this field (with particular
emphasis on hydrogen exchange, chemical cross-linking, and native electrospray
ionization mass spectrometry), evaluates current challenges, and reviews
possible  future  developments. 


7.1    Physical  Principles  and  Instrumentation 

Mass spectrometry (MS) is one of the oldest methods of instrumental analysis in
chemistry, this year being the centennial of the construction of the first mass spectrometric
device [1]. In addition to rather mundane applications related to molecular
mass measurements (as implied by its name), MS can be used for a variety of other
tasks, many of which are uniquely suited to address challenging questions in molecular
biophysics and structural biology. However, it was not until the advent of the
two ionization techniques capable of producing ions of large and polar molecules,
electrospray ionization (ESI) and matrix-assisted laser desorption/ionization
(MALDI), that MS became a commonly accepted tool in the armamentarium of
modern molecular biophysics.

7.1.1    Methods  of  Producing  Biomolecular  Ions

MS is unique among the analytical techniques commonly applied to study biomolecular
structure and behavior in that the actual physical measurements are carried
out in vacuum or in the gas phase, where either an electric field alone or its combination
with a magnetic field is used to determine ionic mass-to-charge ratios (m/z).
Placing a large biomolecular ion in vacuum is no trivial task, and the absence of
robust methods to do so limited the utility of MS in the biophysical arena until
the early 1990s.

7.1.1.1    Electrospray  Ionization 

The advent of ESI MS in the mid-1980s [2] provided a means to observe spectra of
intact proteins with no apparent mass limitation, an invention honored with a Nobel
Prize in Chemistry to John Fenn in 2002 [3]. Although the ESI phenomenon had been
known and extensively studied for over a century, and its great
analytical potential in the macromolecular realm had become apparent as early as
the 1960s [4], the practical applications of this ionization technique were limited to
small biomolecules, such as nucleobases, amino acids [5], and short peptides [6, 7].
It was not until the demonstration of the ability of ESI to generate ionic signals for
protein molecules in the form suitable for MS analysis [8] that this technique rapidly
gained acceptance and recognition among MS practitioners and quickly became
a tool of choice in a variety of studies of biomolecular structure.

ESI is a convoluted process, whose detailed discussion is beyond the scope of
this chapter. Briefly, the protein (or, generally speaking, any biopolymer) solution is
sprayed at atmospheric pressure in the presence of a strong electrostatic field, which
generates metastable electrically charged droplets of the solvent encapsulating the
protein molecules. Such droplets undergo a series of fission events, eventually producing
either solvent-free or partially solvated protein ions. A very distinct feature
of the ESI process is the accumulation of multiple charges on a single protein molecule,
which leads to the appearance of multiple peaks in a mass spectrum even
when a single protein is present in solution (Fig. 7.1a, b). In most cases multiple
charging is the result of protonation of a number of different sites within the protein
molecule, although other ubiquitous charge carriers (such as Na+, K+, NH4+) may
also contribute. A set of ion peaks, each representing the same protein molecule and
differing from the rest by the extent of multiple charging, is usually referred to as a
charge state distribution. Determination of the protein mass based on the experimentally
measured charge state distribution is relatively straightforward, and can be
easily accomplished using a variety of deconvolution routines even if the mass spectrum
contains several overlapping charge state distributions representing different
biomolecules.
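
The simplest form of this mass determination uses two neighboring peaks of one charge state distribution. If the only charge carriers are protons, a peak with charge z appears at m/z = (M + z*mH)/z, so two adjacent peaks give two equations for the two unknowns M and z. The sketch below is a minimal illustration under those assumptions, not one of the deconvolution routines used in practice; the function names and the protein mass of 23,000 Da are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_of(mass, charge):
    """m/z of an ion carrying `charge` protons (positive ion mode)."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_lower_charge, mz_higher_charge):
    """Recover (charge, mass) from two adjacent peaks with charges z and z + 1.

    mz_lower_charge is the peak at higher m/z (charge z);
    mz_higher_charge is its neighbor at lower m/z (charge z + 1).
    """
    spacing = mz_lower_charge - mz_higher_charge
    charge = round((mz_higher_charge - PROTON_MASS) / spacing)
    mass = charge * (mz_lower_charge - PROTON_MASS)
    return charge, mass


HYPOTHETICAL_MASS = 23000.0  # Da, illustrative value only
peak_z10 = mz_of(HYPOTHETICAL_MASS, 10)
peak_z11 = mz_of(HYPOTHETICAL_MASS, 11)
charge, mass = mass_from_adjacent_peaks(peak_z10, peak_z11)
print(f"peaks at m/z {peak_z10:.2f} and {peak_z11:.2f}")
print(f"charge of the first peak = {charge}, mass = {mass:.1f} Da")
```

Real spectra contain noise, adducts such as Na+ or K+, and overlapping distributions, so practical routines fit the whole charge state envelope rather than a single pair of peaks.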

Most ESI MS analyses are carried out in the positive ion mode (where biopolymer
molecules are represented in mass spectra by polycationic species), but one
can easily produce polyanionic species as well simply by switching the polarity of
the ESI source. In this case multiple charging of macromolecules will be achieved
by removing labile protons from the analyte molecule (de-protonation). While proteins
are usually analyzed by ESI MS in the positive ion mode, switching to the
negative ion mode could be advantageous for certain other biopolymers, such as
nucleic acids. It must be stressed, however, that for any biopolymer both positive
and negative ion spectra can be produced, and the charge state distributions in these
spectra do not reflect the charge balance in solution [9].

Fig. 7.1 ESI mass spectra of a peptide SWANGDEAR (a) and trypsin (b). The panels on the left
represent full-scan mass spectra, and the panels on the right show detailed views of a single charge
state (the three traces in each case represent mass spectra acquired with a triple quadrupole MS,
hybrid quadrupole/TOF MS, and FT ICR MS), with the insets showing zoomed views of mass
spectra acquired with quadrupole/TOF and FT ICR MS. Note that although the resolving power of
TOF is sufficient to resolve isotopic peaks of the peptide ion, it fails to detect the presence of a
degraded (de-amidated) form of this peptide (between m/z 503.7 and 503.8). Isotopic distribution
of trypsin ions can only be resolved by FT ICR MS, although both quadrupole/TOF and FT ICR
MS can resolve contributions of three different isoforms of this protein


7.1.1.2  MALDI 

Another approach to producing macromolecular ions and transferring them to
vacuum was introduced at about the same time ESI MS was developed; unlike ESI it
produces ions not from the bulk of the solution, but from the interface of a
condensed phase (usually solid crystals) and the vacuum. This task is accomplished
by mixing the analyte molecules with an excess of UV light-absorbing small organic
molecules, which form the sample matrix, followed by irradiation with a UV laser
beam. This results in rapid local heating of the matrix and subsequent ejection of a
plume containing both matrix and analyte molecules from the solid surface to the
gas phase and their ionization. This technique, presently known as MALDI, was
developed simultaneously by Koichi Tanaka [10] and by Franz Hillenkamp and
Michael Karas [11].

Biopolymer  ions  produced  by  MALDI  can  also  carry  multiple  charges;  however, 
the  extent  of  protonation  is  significantly  below  that  achieved  with  ESI.  Generally, 
MALDI  MS  surpasses  conventional  ESI  MS  in  terms  of  sensitivity  and  is  more 
tolerant  to  salts.  Superior  sensitivity,  relative  simplicity  of  operation,  and  ease  of 
automation  have  made  it  a  top  choice  as  an  analytical  technique  for  a  variety  of 
proteomics-related  applications.  On  the  other  hand,  MALDI  mass  spectra  generally 
are not as reproducible as ESI mass spectra. Also, interfacing MALDI with
separation techniques, such as liquid chromatography (LC), is more difficult than coupling
LC to ESI MS.


7.1.2    Mass  Measurements 

Mass (or, more precisely, mass-to-charge ratio, m/z) of an ion can be determined by
MS, because this characteristic of a charge-carrying particle uniquely defines its
trajectory in electric (E) and magnetic (B) fields, as well as their combinations:

$$ m\ddot{\mathbf{r}} = ze\left(\mathbf{E} + [\dot{\mathbf{r}} \times \mathbf{B}]\right). \qquad (7.1) $$

Here ze is the ionic charge expressed as a multiple of the elementary charge e
($1.6022 \times 10^{-19}$ C in SI), m is its mass, while the first and second time derivatives of
the trajectory vector r represent its velocity and acceleration, respectively. Mass
measurements are actually carried out by first separating the ions (either spatially or
temporally) according to their m/z ratios, followed by detection of each type of ion,
although other schemes exist where no physical separation of ions is required prior
to their detection and mass measurement (vide infra).

The ionic m/z ratio measured by MS in most cases can be easily converted to
the ionic mass (after taking into account the multiple charging effect) and,
ultimately, to the molecular mass of the analyte (after taking into account the
finite mass of the charge carriers, residual solvent, and other adducts). The notion
of molecular mass (measured in unified atomic mass units, defined by IUPAC as
1/12 of the mass of a $^{12}$C atom in its ground state, u ≈ $1.660\,5402(10) \times 10^{-27}$ kg) is
closely related to the concept of molecular weight, a sum of the atomic weights of
all atoms in a given molecule. However, the atomic weight of an element is a
weighted average of the atomic masses of all of its stable isotopes, and the
isotopic make-up is implicitly included in the definition. Contributions of isotopes are
not necessarily averaged out when ionic masses are measured by MS, and in many
cases such measurements produce a distribution of masses, rather than a single
value. This, of course, depends on the physical size of the analyte molecule and
the mass resolution characteristics of the MS instrument (vide infra). Most
modern MS instruments are capable of resolving isotopic distributions for relatively
short peptides (Fig. 7.1a), while accomplishing the same task for proteins requires
more technologically advanced (and alas, more expensive) instrumentation
(Fig. 7.1b).

To avoid ambiguity in reporting molecular masses, one can use the notion of an
average mass, which is calculated based on the entire isotopic distribution and is
closely related to the molecular weight as used elsewhere in chemistry and related
disciplines. In some applications, however, a monoisotopic mass would be a
preferred way of reporting the molecular mass with high precision and accuracy (it is
calculated based on contributions only from the lightest isotope for each element).
Obviously, the use of the monoisotopic mass in reporting the MS measurement
results is justified only if the resolution is high enough to afford separation of
isotopic peaks in the mass spectra and the monoisotopic peak is one of the most abundant
peaks in the distribution.
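
The distinction between the two masses can be made concrete with a short calculation. The sketch below computes both values for the peptide SWANGDEAR of Fig. 7.1 from its elemental composition. The residue compositions, isotope masses, and atomic weights are standard tabulated values entered here by hand (rounded), and the helper names are hypothetical; only the residues needed for this one peptide are included.

```python
# (mass of the lightest stable isotope, standard atomic weight), both in Da
ELEMENTS = {
    "C": (12.0, 12.0107),
    "H": (1.00782503, 1.00794),
    "N": (14.0030740, 14.0067),
    "O": (15.9949146, 15.9994),
}

# Elemental composition of amino acid residues (only those needed here)
RESIDUES = {
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "W": {"C": 11, "H": 10, "N": 2, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "N": {"C": 4, "H": 6, "N": 2, "O": 2},
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "D": {"C": 4, "H": 5, "N": 1, "O": 3},
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
}


def composition(sequence):
    """Elemental composition of a linear peptide with free termini."""
    total = {"C": 0, "H": 2, "N": 0, "O": 1}  # start from H2O for the termini
    for residue in sequence:
        for element, count in RESIDUES[residue].items():
            total[element] += count
    return total


def masses(sequence):
    """Return (monoisotopic mass, average mass) in Da."""
    comp = composition(sequence)
    mono = sum(n * ELEMENTS[el][0] for el, n in comp.items())
    avg = sum(n * ELEMENTS[el][1] for el, n in comp.items())
    return mono, avg


mono, avg = masses("SWANGDEAR")
print(f"monoisotopic: {mono:.3f} Da, average: {avg:.3f} Da")
```

For a peptide of this size the two values differ by roughly half a mass unit; the gap grows with molecular size as the heavier isotopes contribute more to the distribution.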


7.1.3    Tandem  Mass  Spectrometry 

The most attractive feature of both ESI and MALDI is their ability to generate
intact macromolecular ions in a form suitable for mass measurement. However,
this information is not sufficient in most instances for unequivocal identification of
even small peptides, let alone large macromolecules. This task requires at least
some knowledge of the covalent structure, which can be obtained by inducing
dissociation of macromolecular ions and measuring the masses of the resulting
fragment ions. Since most proteins and peptides are linear polymers, cleavage of a
single covalent bond along the backbone generates a fragment ion (or two
complementary fragment ions if the charge of the precursor ion z = 2 or higher) classified as
an a-, b-, c- or x-, y-, z-type [12, 13], depending on (1) the type of the bond cleaved
and (2) whether the fragment ion contains an N- or C-terminal portion of the peptide
(Fig. 7.2). Ion dissociation is usually carried out following isolation of the ion of
interest from other ionic species that may be present in the mass spectrum. This
approach, known as tandem mass spectrometry or MS/MS, allows the fragment ion/
precursor ion correlation to be established easily [14] and is indispensable for many
biophysical applications of MS (vide infra).
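
To show how fragment masses encode the sequence, the following sketch lists the m/z values expected for singly protonated b- and y-type ions of a peptide. It is a simplified illustration: it covers only these two ion types, assumes unmodified residues and a single proton per fragment, and uses hand-entered monoisotopic residue masses; the function names are hypothetical.

```python
MONO = {  # monoisotopic residue masses, Da
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da
PROTON = 1.00728   # Da


def b_ions(seq):
    """Singly protonated N-terminal fragments b1 ... b(n-1)."""
    out, running = [], 0.0
    for residue in seq[:-1]:
        running += MONO[residue]
        out.append(running + PROTON)
    return out


def y_ions(seq):
    """Singly protonated C-terminal fragments y1 ... y(n-1)."""
    out, running = [], WATER
    for residue in reversed(seq[1:]):
        running += MONO[residue]
        out.append(running + PROTON)
    return out


peptide = "SWANGDEAR"
for i, (b, y) in enumerate(zip(b_ions(peptide), y_ions(peptide)), start=1):
    print(f"b{i}: {b:9.4f}    y{i}: {y:9.4f}")
```

The difference between consecutive members of either series equals one residue mass, which is what allows the sequence to be read from a ladder of fragment peaks.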

The majority of tandem MS experiments employ various means of increasing the
internal energy of the precursor ion to induce its dissociation. Collisional activation
remains the most widely used method of elevating ion internal energy [15]; it
typically yields b- and y-ions, although collision-induced dissociation (CID) at
high energy may also lead to formation of other fragments, particularly a- and
x-type (Fig. 7.3a, b). Excitation of ions leading to their dissociation can also be
achieved using other means, such as interaction with photons (a technique known
as infrared multi-photon dissociation, IRMPD [16]) or with electrons (two closely
related techniques, known as electron capture dissociation, ECD [17] and electron
transfer dissociation, ETD [18]). While the outcome of IRMPD is usually very
similar to that of low-energy CID, ECD and ETD typically generate c- and z-fragments,
and often provide more extensive sequence coverage in polypeptides compared to
conventional CID (Fig. 7.3c). Another very attractive feature of electron-based
fragmentation techniques is their ability to preserve labile groups introduced
through posttranslational modification (PTM) of proteins and cleave disulfide bonds
in peptide polycations [19], a challenging task when other methods of ion activation
are employed. The fragmentation patterns produced by ECD and ETD are
frequently complementary to the CID-generated fragments [20], hence the benefit of
using multistage fragmentation (the so-called MSn experiments) consisting of both
CID and ECD (or ETD).

Fig. 7.2 Biemann's nomenclature of peptide ion fragments [12]. Fragment ions shown in gray
boxes correspond to either complete or partial loss of the side chains and are usually observed only
in high-energy CID

Fig. 7.3 High-energy CID (a), low-energy CID (b), and ECD (c) fragmentation spectra of a
2.8 kDa melittin peptide. Only the most abundant fragment ions are labeled in the spectra


7.1.4    Common Types of Mass Analyzers

As has already been mentioned in this chapter, m/z measurements of
macromolecular ions by MS rely on the unique dependence of the ionic trajectory in electric and
magnetic fields on this parameter, as shown in equation (7.1). The practical
implementation of this principle takes a wide variety of approaches, hence the great number
of mass analyzers, which differ from each other not only in the amount and quality
of information that can be extracted from mass measurements but also in price.
Given the obvious space limitations of this volume, we cannot provide extensive
coverage of all available types of mass analyzers, but instead focus our attention on
three different types representing the ends and the middle of both performance and
price scales. These are quadrupole, time-of-flight (TOF), and Fourier transform ion
cyclotron resonance (FT ICR) mass spectrometers.


7.1.4.1    Quadrupole,  Triple  Quadrupole,  and  Ion  Trap  MS 

Strictly speaking, quadrupole MS should be called a mass filter, rather than a mass
analyzer, since the dynamic quadrupolar electric field employed by this device
allows ions within a narrow m/z range to be transmitted through it and
eventually reach a detector, while all other ions assume unstable trajectories and are
lost prior to detection (Fig. 7.4). The m/z range of a typical quadrupole MS is
limited to 4,000 (with many commercial instruments having even less generous m/z
limits). The mass resolution of a quadrupole MS is not constant across the m/z scale,
and rarely exceeds the level of several thousands. On the other hand, these devices
provide good sensitivity and are capable of obtaining mass spectra fast enough to
allow direct coupling to LC. MS/MS experiments can be carried out if three
quadrupoles are arranged in tandem (a configuration referred to as QqQ, or so-called
triple quadrupole MS). The first quadrupole is set to transmit ions of a certain m/z
value (precursor ions), while the second is used as a collision cell and transmits all
ions (precursor and CID fragments) into the third quadrupole, which is scanned to
obtain a fragment ion spectrum.

Other MS/MS experiments can be designed; for example, the third quadrupole
can be set to allow the transmission of fragment ions at certain m/z values, while
the first quadrupole is scanned. Mass spectra acquired in this mode contain peaks
of all ions whose fragmentation gives rise to a selected fragment (the so-called
precursor ion scans). Alternatively, scanning both first and third quadrupole filters
at the same rate but with a fixed m/z offset while generating fragment ions in the
second nondiscriminating quadrupole produces a spectrum of ions that undergo
fragmentation via loss of a specific neutral fragment (the so-called constant neutral
loss scans). Triple quadrupole mass spectrometers are indispensable in applications
that require quantitation of both small organic and biological analytes.
However, the modest resolution and m/z range of such mass spectrometers limit
their use in biophysical and structural biology studies, although these devices are
often interfaced with other (higher-end) mass analyzers to produce hybrid mass
spectrometers.

Fig. 7.4 A schematic representation of a quadrupole mass filter with examples of stable and
unstable ion trajectories
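
The logic of the three triple quadrupole scan modes discussed in this section can be summarized with a toy model. In the sketch below the instrument is reduced to a lookup table of precursor ions and their fragments; all m/z values are invented for illustration, every ion is assumed to be singly charged (so that a neutral loss equals the m/z difference), and the function names are hypothetical.

```python
# Hypothetical precursor m/z -> fragment m/z values (all ions singly charged)
FRAGMENTS = {
    500.3: [402.3, 175.1, 86.1],
    612.4: [514.4, 175.1],
    455.2: [437.2, 120.1],
}


def product_ion_scan(precursor):
    """Q1 fixed on one precursor, Q3 scanned: fragments of that precursor."""
    return sorted(FRAGMENTS.get(precursor, []))


def precursor_ion_scan(fragment, tol=0.01):
    """Q3 fixed on one fragment, Q1 scanned: precursors yielding that fragment."""
    return sorted(
        p for p, frags in FRAGMENTS.items()
        if any(abs(f - fragment) <= tol for f in frags)
    )


def neutral_loss_scan(loss, tol=0.01):
    """Q1 and Q3 scanned with a fixed offset: precursors losing that neutral."""
    return sorted(
        p for p, frags in FRAGMENTS.items()
        if any(abs((p - f) - loss) <= tol for f in frags)
    )


print(product_ion_scan(455.2))
print(precursor_ion_scan(175.1))
print(neutral_loss_scan(98.0))
```

In a real experiment the table is not known in advance; the scans are how it is interrogated. For multiply charged precursors the fixed offset would have to be scaled by the charge.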

Quadrupolar devices can also be used to construct a different type of mass
analyzer, one where instead of being analyzed in a single pass through the dynamic
quadrupolar field region, ions are stored (trapped) for prolonged periods of time
[21]. The simplest design of such an ion trap is a segmented quadrupole (based on
a triple quadrupole design), in which the central pressurized segment confines the
ions radially in a dynamic (radio frequency) quadrupolar field, while the terminal
segments provide repulsive DC potentials at either end that prevent the ions from
escaping the central quadrupole in the axial direction. An alternative design (which
is frequently referred to as a 3D ion trap to distinguish it from the linear trap
described above) can be viewed as a single quadrupole filter that has been made
into a toroidal device by connecting the opposite ends of each quadrupole rod and
then "collapsing" this four-ring structure towards its axis of radial symmetry. In
this case only one ring (the furthest from the axis) remains a ring, while the one
closest to the axis completely disappears, and two other rings become endcaps
flanking the remaining ring. This three-electrode system can be used to create a 3D
quadrupolar electrical field, which confines ions within this device, a process that
is greatly facilitated by the presence of He gas, which removes excess energy from
ions via the so-called collisional damping [22, 23]. Gradual variation of electrode
potentials destabilizes the trapped ions in an m/z-sensitive fashion and forces them
to leave the confines of the trap, a feature that enables both MS measurements and
precursor ion selection for MS/MS experiments; this field-induced external
excitation can also be used to ramp up the energy of the ions, which is then converted to
internal energy upon collisions with He atoms, and eventually leads to ion
dissociation [22-25].

A very significant advantage of both types of ion trapping devices described
above over their progenitor quadrupole MS is that MS/MS measurements can be
carried out within a single analyzer, without the need to have a dedicated collision
cell and a second mass analyzer. Furthermore, any of the fragment ions produced in
the course of an MS/MS experiment can also be isolated in the trap, collisionally
activated and fragmented, followed by the acquisition of a mass spectrum of the
second generation of fragment ions. This process can be repeated any number of
times, as long as the number of ions remaining in the trap is high enough to provide
a usable signal-to-noise ratio. Such experiments are referred to as multi-stage
tandem MS, or simply MSn. Due to significant improvements in the performance of
ion traps in the past two decades, ease of operation, and relatively low cost, they have
become very popular, both as standalone mass spectrometers and as part of hybrid
instruments. Limitations of ion traps are similar to those of quadrupole MS: modest
mass resolution and a relatively low upper limit of the m/z range where MS (and MS/
MS) data can be collected.

7.1.4.2    TOF  MS  and  Hybrid  Quadrupole/TOF  MS 

Ion separation in the TOF MS is based on the fact that the velocity v of an ion
accelerated in an electrostatic field will be determined by the magnitude of the
acceleration potential U0 and the ionic m/z ratio. Measuring the time period t needed to
traverse a field-free drift region of length D would then allow the ionic m/z ratio to
be determined (since $zeU_0 = mv^2/2$ and $t = D/v$):

$$ \frac{m}{z} = \frac{2eU_0 t^2}{D^2}. \qquad (7.2) $$


This approach, however, results in relatively poor mass resolution, mostly due
to a significant spread of ionic kinetic energies prior to acceleration. To correct this,
several approaches can be used, where energy focusing of the ions is done by
delaying ion acceleration using pulsed (delayed) extraction [26] or by using the
so-called ion mirror or reflectron [27]. The principle of the reflectron operation is
illustrated in Fig. 7.5: if two identical ions have different velocities, the faster ion
will penetrate deeper into the decelerating region of the reflectron, and its overall
trajectory path will be longer. After its reemergence from the reflectron, this ion
would still have a higher velocity, but it will be lagging behind the slower ion due
to spending a longer time in the decelerating region. Such relatively simple
single-stage reflectrons can only perform first order velocity focusing, but more
sophisticated devices (e.g., double stage ion mirrors) can provide velocity focusing to a
higher order [28, 29].
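
The effect of the initial energy spread on a linear TOF measurement can be estimated numerically. The sketch below assumes an idealized instrument in which the ion acquires all of its energy before entering the drift region (time spent in the acceleration region is ignored); the 20 kV potential, 1 m drift length, and 10 eV excess energy are hypothetical values, and the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def flight_time(mz, u0_volts, drift_m, extra_ev=0.0):
    """Drift time (s) of an ion; extra_ev is initial kinetic energy per charge (eV)."""
    speed = math.sqrt(2.0 * E_CHARGE * (u0_volts + extra_ev) / (mz * DALTON))
    return drift_m / speed


def mz_from_time(t_seconds, u0_volts, drift_m):
    """Invert the flight-time relation, assuming the ion started at rest."""
    return 2.0 * E_CHARGE * u0_volts * t_seconds ** 2 / (DALTON * drift_m ** 2)


U0, D = 20_000.0, 1.0                                  # hypothetical 20 kV, 1 m
t_rest = flight_time(1000.0, U0, D)                    # ion initially at rest
t_fast = flight_time(1000.0, U0, D, extra_ev=10.0)     # ion with 10 eV to spare
apparent = mz_from_time(t_fast, U0, D)

print(f"flight time at rest: {t_rest * 1e6:.2f} us")
print(f"apparent m/z of the faster ion: {apparent:.2f}")
```

Under these assumptions two ions of identical m/z arrive at different times and would be assigned masses differing by about half an m/z unit, which is the kind of blurring that delayed extraction and reflectrons are designed to reduce.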

Reflectrons also allow MS/MS measurements to be carried out with a single TOF
mass analyzer [28], although a combination of two TOF analyzers or a hybrid
instrument consisting of TOF and another, lower resolution, mass analyzer (such as
a quadrupole MS) usually offers more flexibility in experiment design and delivers
better data quality. A hybrid quadrupole-TOF instrument is a particularly popular
configuration, which is offered by several manufacturers of MS instrumentation.
Typically, a front-end quadrupole is used for mass-selection of precursor ions,
followed by an RF-only quadrupole serving as a collision cell; the fragment ions are
then analyzed with high resolution by a reflectron-equipped TOF section of the
instrument. MS1 measurements are carried out by operating both quadrupole
segments in the RF-only mode, so that they only serve as ion guides; all mass
measurements are carried out by the TOF analyzer, which offers both better resolution
(>10,000) and an m/z range vastly superior to that of the quadrupole MS.

Fig. 7.5 Schematic diagrams of linear (top) and single-stage reflectron (bottom) time-of-flight
mass spectrometers


7.1.4.3    FT  ICR  MS 

FT ICR MS is an example of a high-performance mass spectrometer employing an
ion trapping mass analyzer. However, unlike its relatively inexpensive cousins, the
quadrupolar ion trap and linear ion trap considered in Sect. 7.1.4.1, it offers
unparalleled mass resolution and unmatched mass accuracy (another high-performance
mass analyzer based on the ion trapping principle is the orbitrap MS [30, 31]). Ion
trapping is achieved in FT ICR MS by using a combination of electrostatic and
magnetic fields, as shown in a schematic form in Fig. 7.6. A DC potential applied to
the front and back plates of the cubic cell restricts the ionic motion along the z-axis,
essentially locking the ions in the cell following their injection from the external
source. A strong magnetic field (typically 4.7-12.0 T) applied in the direction of the
z-axis exerts a Lorentz force on the trapped ions, which acts as a centripetal force,
inducing a circular (cyclotron) motion in the (x, y) plane. The frequency of the
cyclotron motion $\omega_c$ is independent of the ionic energy, but is uniquely determined
by its m/z ratio and the strength of the magnetic field B:

$$ \omega_c = \frac{zeB}{m}, \qquad (7.3) $$

providing the physical basis of the mass measurement. Since frequency is a
physical parameter that can be measured very accurately, mass spectrometers based on
the principle of cyclotron motion can provide the highest accuracy in m/z
measurements.

Fig. 7.6 Principle of ion trapping, broadband excitation and detection in FT ICR MS. Reproduced
with permission from [132]
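
To give a sense of the frequencies involved, the sketch below evaluates the unperturbed cyclotron relation for a singly charged ion and inverts it to recover m/z. The 9.4 T field and m/z 1000 are illustrative choices within the range quoted above, the function names are hypothetical, and the small shift caused by the trapping potential is ignored.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency (Hz) for an ion of given m/z (Da per charge)."""
    omega = E_CHARGE * b_tesla / (mz * DALTON)   # angular frequency, rad/s
    return omega / (2.0 * math.pi)


def mz_from_frequency(freq_hz, b_tesla):
    """Invert the cyclotron relation to obtain m/z from a measured frequency."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * freq_hz * DALTON)


f = cyclotron_frequency_hz(1000.0, 9.4)
print(f"m/z 1000 at 9.4 T: {f / 1e3:.1f} kHz")
print(f"recovered m/z: {mz_from_frequency(f, 9.4):.3f}")
```

Frequencies in the range of tens to hundreds of kilohertz are readily digitized, which is part of why the frequency-based measurement can be so precise.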

Ion detection in FT ICR MS is done by measuring the magnitude of the image
current induced on the detection plates by the ion orbiting in the space between
them (Fig. 7.6). Since unsynchronized motion of a large number of ions generates
zero net current, ion detection must be preceded by ion excitation (e.g., by applying
a uniform harmonic electric field in the direction orthogonal to the magnetic field).
If the field frequency is the same as the cyclotron frequency of the orbiting ions,
they will be synchronized (brought in phase with the field). Such resonant excitation
also elevates ion kinetic energy, increasing the radii of their orbits, which leads to
an increase of the image current induced by each ion. Synchronized ions of the
same m/z ratio induce an image current whose angular frequency $\omega$ is equal to their
cyclotron frequency $\omega_c$ and whose amplitude is proportional to the number of
ions in the cell [32]. If several types of ions (with different m/z ratios) are present in
the cell, their excitation/synchronization requires application of a broadband chirp
as opposed to a harmonic signal, and the resulting image current is a superposition
of several sinusoidal signals (the actual cyclotron frequency in a real ICR cell is
lower than $\omega_c$ due to the influence of a trapping electrostatic potential). Fourier
transformation of such a signal allows the cyclotron frequencies of all ions to be
determined and their m/z values calculated (Fig. 7.6).
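
The detection scheme can be imitated with a few lines of code: a synthetic transient is built as a sum of two cosines at the cyclotron frequencies of two ion species, and a discrete Fourier transform is used to recover their m/z values. This is only a sketch under idealized assumptions (no noise, no signal decay, no trapping-field shift); the field strength, sampling rate, transient length, and ion abundances are hypothetical, and a plain (slow) DFT is used to keep the example free of external libraries.

```python
import cmath
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg
B = 9.4                      # magnetic field, T (hypothetical)
FS = 1.0e6                   # sampling rate, Hz (hypothetical)
N = 512                      # number of samples in the transient


def cyclotron_hz(mz):
    return E_CHARGE * B / (2.0 * math.pi * mz * DALTON)


def mz_from_hz(freq_hz):
    return E_CHARGE * B / (2.0 * math.pi * freq_hz * DALTON)


def magnitude_spectrum(signal):
    """Magnitudes of the DFT for the non-negative frequency bins below Nyquist."""
    n = len(signal)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * j / n) for j, s in enumerate(signal)))
        for k in range(n // 2)
    ]


def two_strongest_peaks(mags):
    """Indices of the two largest local maxima of the spectrum."""
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    return sorted(peaks[:2])


# Two ion species: m/z -> relative abundance (sets the image current amplitude)
ions = {1000.0: 1.0, 800.0: 0.6}
freqs = {mz: cyclotron_hz(mz) for mz in ions}
transient = [
    sum(a * math.cos(2.0 * math.pi * freqs[mz] * j / FS) for mz, a in ions.items())
    for j in range(N)
]

mags = magnitude_spectrum(transient)
recovered = [mz_from_hz(k * FS / N) for k in two_strongest_peaks(mags)]
print([round(mz, 1) for mz in recovered])
```

With such a short transient the frequency bins are about 2 kHz wide, so the recovered values are only accurate to a fraction of a percent; acquiring a longer transient narrows the bins and sharpens the mass measurement accordingly.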

Apart from ultra-high mass resolution and accuracy, a great advantage offered by
FT ICR MS over most other mass analyzers is that it allows all ions across a wide
m/z range to be detected (1) simultaneously within a very short period of time and
(2) in a nondestructive fashion. The latter feature allows the data acquisition to be
carried out with the same ion population over an extended period of time using
multiple remeasurements, forming the basis of the MSn (as opposed to MS/MS or
MS2) experiments. Ion isolation in the ICR cell can be achieved using inverse FT
(from the frequency to the time domain), and fragmentation of the isolated ions can
be induced by either collisional activation or electron capture (other methods of ion
activation, such as IRMPD, are also available). Combining FT ICR MS with another
mass analyzer (e.g., quadrupole) as a front end leads to further expansion of the
repertoire of the ion fragmentation techniques, e.g., by allowing ETD to be carried
out under conditions of relatively high pressure prior to introduction of fragment
ions to the ICR cell for either high-resolution mass analysis or interrogation with
orthogonal ion fragmentation techniques that can be performed in the high-vacuum
environment of the ICR cell. Combination of several ion fragmentation techniques
in one experiment often provides significant improvement of the sequence coverage
of macromolecular ions [33].


7.2    Analysis  of  Covalent  Structure 

7.2.1    Covalent  Structure  of  Polypeptides  and  Proteins 

Tandem mass spectrometry provides the means to obtain information on covalent
structure of polypeptides and proteins by employing a combination of various
MS-based techniques. Typically, these are grouped in two broad categories, the
so-called bottom-up and top-down approaches, which are considered in the following
sections.


7.2.1.1    Polypeptide  Sequencing:  The  Bottom-Up  Approach 

The classical approach to polypeptide sequencing by MS relies on enzymatic
cleavage of a protein to relatively short peptides, followed by their separation by LC and
analysis of their structure using MS/MS methods [34]. The chromatographic step is
usually combined with MS and/or MS/MS analysis, which frequently allows a great
wealth of sequence information to be obtained in a single LC/MS/MS experiment
(Fig. 7.7). The entire procedure can be automated on most commercial instruments,
which allows MS/MS operation to be performed in a data-dependent fashion, while
the data interpretation step is frequently assisted by database searches. The latter
allows peptides and proteins to be identified even if the fragmentation patterns
contain significant sequence gaps.

Fig. 7.7 An example of using LC/MS/MS to obtain protein sequence information. A purified
66 kDa protein bovine serum albumin has been digested with trypsin, followed by separation of
proteolytic fragments on a reversed-phase (C18) column with online ESI MS detection. The black
trace in the top panel shows the total ionic signal recorded by ESI MS as a function of the elution
time, while the red and blue traces represent ionic signals at two specific m/z values, which
correspond to two proteolytic peptide ions, TCVADESHAGCEK64 (charge state +3; both cysteine side
chains are fully reduced and methylated) and TVMENFVAFVDK556 (charge state +2). The MS/
MS spectra of these two peptide ions acquired in a data-dependent fashion (by selecting the most
abundant ion in MS1 spectrum as a precursor for CAD) are shown in the bottom panels. All
structurally diagnostic ions are labeled in the mass spectra, and the corresponding backbone cleavage
positions are shown within each peptide's sequence


7.2.1.2  Polypeptide  Sequencing:  The  Top-Down  Approach

The top-down approach to polypeptide and protein sequencing completely bypasses
the enzymatic degradation step, with all structural information derived from
dissociation of the intact protein or polypeptide ion in the gas phase [35]. While this
approach has been used successfully by many groups to obtain sequence
information on relatively small proteins (<30 kDa), its application to larger proteins is not
straightforward even when high-end instrumentation is used. Nevertheless,
successful utilization of this methodology was demonstrated for identification of proteins
beyond 500 kDa [36], although such examples remain very rare.

7.2.1.3  Posttranslational  Modifications 

Analysis of PTM of proteins is another area where MS-based methods of analysis
are now playing a major role. Due to the labile nature of many PTMs, application of
traditional MS/MS approaches to identify specific modifications and localize them
within the protein sequence meets only with limited success. For example,
collisional activation of glyco- and phospho-peptides frequently leads to facile removal
of PTM moieties prior to cleavage of the peptide backbone, leaving no mass tags on
amino acid residues that were modified and making their identification a
challenging task. However, the electron-based ion dissociation techniques (such as ECD and
ETD) allow this conundrum to be solved, since the fragmentation events are highly
localized and do not require accumulation of vibrational energy within the peptide
ion over an extended period of time (as does CAD).

7.2.1.4  Covalent  Structure  of  Other  Biopolymers 

While the analysis of protein covalent structure by MS-based methods has gained the
most recognition and is in fact the default approach both to obtaining amino acid
sequence information and to mapping PTMs, structural analysis of other biopolymers
has also benefited enormously from recent improvements in MS hardware and methodology.
For example, both MALDI and ESI MS have been used successfully to measure
masses of intact RNA molecules and other nucleic acids; however, these
analyses  frequently  present  a  number  of  challenges,  mostly  due  to  the  ability  of  the 
phosphodiester  backbone  of  nucleic  acids  to  form  adducts  with  alkali  and  alkaline 
earth  metal  cations.  This  typically  leads  to  very  broad  peaks  in  mass  spectra 
(Fig.  7.8),  although  extensive  buffer  exchange  into  volatile  ammonium  salts  to  dis- 
place metal  cations,  desalting  by  metal  chelation  or  HPLC  can  improve  the  spectral 
quality.  Sequence  information  can  be  obtained  by  means  of  MS/MS,  or  simply  by 
inducing  fragmentation  in  the  ionization  source,  e.g.,  by  increasing  the  laser  power 
in  MALDI  measurements.  Dissociation  of  nucleic  acids  along  the  phosphodiester 
backbone  produces  structurally  diagnostic  ions,  and  these  fragment  ion  ladders 
(Fig. 7.9) can be used to determine the oligonucleotide sequence. This approach to
oligonucleotide sequencing is analogous to how peptide fragmentation patterns
reveal the amino acid sequence (vide supra), although it currently remains practical
only for relatively short oligonucleotides.

Fig. 7.8 ESI mass spectrum of tRNA^Thr acquired with a hybrid quadrupole/TOF MS (10 µM in
20 mM ammonium acetate)

Fig. 7.9 Prompt fragmentation in MALDI MS: UV-MALDI spectra of an oligonucleotide strand
acquired at increased (top trace) and moderate laser power. Adapted with permission from [132]
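
As a rough illustration of how a fragment ion ladder translates into sequence, the sketch below reads an RNA sequence from the mass differences between consecutive ladder members. It assumes an idealized, complete ladder of singly charged fragments sharing a common terminus (so that each step differs by exactly one nucleotide residue), a hypothetical mass tolerance of 0.02 Da, and the monoisotopic residue masses of the four unmodified ribonucleotides. The starting mass and the 5-mer are invented for the example and do not correspond to the data in Fig. 7.9.

```python
# Monoisotopic masses (Da) of RNA nucleotide residues
# (nucleoside monophosphate minus H2O).
RESIDUE_MASS = {"A": 329.0525, "C": 305.0413, "G": 345.0474, "U": 306.0253}

def read_ladder(ladder_masses, tol=0.02):
    """Translate consecutive mass differences of a fragment ladder into a sequence.

    ladder_masses: masses of ladder members, each one residue longer than the last.
    Returns a string with '?' wherever no residue matches within tol.
    """
    masses = sorted(ladder_masses)
    sequence = []
    for lighter, heavier in zip(masses, masses[1:]):
        delta = heavier - lighter
        # Pick the residue whose mass is closest to the observed spacing.
        base, mass = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
        sequence.append(base if abs(mass - delta) <= tol else "?")
    return "".join(sequence)

# Synthetic ladder built from a hypothetical 5-mer on an arbitrary starting fragment.
start = 500.0
ladder = [start]
for base in "GAUCA":
    ladder.append(ladder[-1] + RESIDUE_MASS[base])
print(read_ladder(ladder))
```

Real ladders are rarely this clean: missing members, multiple charge states, base losses, and metal adducts all complicate the assignment, which is one reason the approach remains limited to short oligonucleotides.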

MS/MS  methods  can  also  be  applied  to  obtain  information  on  covalent  structure 
of  another  type  of  biopolymer,  polysaccharides,  although  these  analyses  tend  to  be 
less  straightforward.  Dissociation  of  polypeptide  and  short  oligonucleotide  ions 
tends  to  follow  well-defined  pathways,  primarily  occurring  along  the  backbone.  This 
conveniently  generates  structurally  diagnostic  fragments  from  which  the  sequence 
of  the  intact  biopolymer  can  be  derived.  By  contrast,  dissociation  of  carbohydrate 
ions  frequently  leads  to  much  more  complex  fragmentation  patterns.  Chemical  bond 
fission  commonly  occurs  not  only  between  saccharide  units  but  also  across  the  gly- 
cosidic  ring  [37],  and  multiple  rearrangement  pathways  are  available  to  activated 
species  that  render  analysis  of  tandem  MS  data  extremely  complex.  Further  compli- 
cation arises  due  to  the  fact  that  unlike  polypeptides  and  oligonucleotides,  polysac- 
charides in  general  are  not  linear  polymers,  and  the  presence  of  multiple  branching 
points  makes  the  interpretation  of  MS/MS  data  a  challenging  task.  Data  analysis  can 
be  simplified  by  inducing  fragmentation  of  polysaccharide  ions  with  low-energy 
collisional  activation,  which  typically  leads  to  dissociation  of  glycosidic  bonds, 
while  leaving  the  rings  intact.  Fragmentation  processes  are  also  strongly  influenced 
by  the  nature  of  the  parent  ion  (alkali  metal  cationized  species  produce  different 
fragmentation patterns compared to protonated species). Additional information can
also be gained by using various chemical derivatization techniques.

Glycopeptides  are  another  area  of  great  interest  and  their  structural  analysis 
entails  localization  of  glycosylation  sites  within  the  polypeptide  chain  in  addition  to 
structural  studies  of  the  carbohydrate  moieties.  Glycosylation  site  analysis  is  typi- 
cally carried  out  by  identifying  glycopeptides  among  proteolytic  fragments  (e.g.,  by 
comparing  peptide  maps  for  intact  and  de-glycosylated  protein).  If  peptide  mapping 
of  de-glycosylated  protein  is  not  feasible  (e.g.,  due  to  poor  solubility  of  the 
carbohydrate-free  form  of  the  protein),  glycopeptides  can  be  identified  in  the  digest 
of  intact  glycoprotein  by  observing  characteristic  losses  (e.g.,  162  Da  for  hexose  resi- 
dues) in  survey  MS/MS  spectra  obtained  with  low-energy  CID  of  peptide  ions,  since 
the  labile  nature  of  glycosidic  bonds  in  the  gas  phase  leads  to  their  facile  dissociation 
(vide  supra).  Precise  localization  of  glycosylation  sites  can  be  accomplished  with 
electron-based  ion  fragmentation  techniques,  as  they  preferentially  cleave  peptide 
backbone,  leaving  the  carbohydrate  chains  mostly  intact  [38].  Complete  determina- 
tion of  structure  (especially  with  novel  glycans)  frequently  requires  the  use  of 
orthogonal  methods,  such  as  NMR  and  X-ray  crystallography  in  addition  to  MS  [39]. 
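
The characteristic-loss screen mentioned above can be sketched as a search for peak pairs separated by one hexose residue. The example assumes a known precursor charge, a hypothetical m/z tolerance of 0.01, and the monoisotopic hexose residue mass of 162.0528 Da (C6H10O5); the peak list is invented for illustration.

```python
HEXOSE_RESIDUE = 162.0528  # monoisotopic mass of a hexose residue (C6H10O5), Da

def hexose_loss_pairs(mz_values, charge, tol=0.01):
    """Return (higher_mz, lower_mz) pairs spaced by one hexose residue at the given charge."""
    spacing = HEXOSE_RESIDUE / charge
    peaks = sorted(mz_values)
    pairs = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            if abs((high - low) - spacing) <= tol:
                pairs.append((high, low))
    return pairs

# Hypothetical doubly charged glycopeptide losing two hexose units in succession.
precursor = 1200.500
spectrum = [precursor, precursor - 81.0264, precursor - 162.0528, 950.40, 700.25]
for high, low in hexose_loss_pairs(spectrum, charge=2):
    print(f"{high:.4f} -> {low:.4f}")
```

A spectrum that yields one or more such pairs is only a candidate glycopeptide; confirmation and site localization still require the approaches described in the text.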


7.3    Analysis  of  Higher  Order  Structure  with  MS  Tools 

The ability of various MS-based techniques to examine covalent structure of proteins,
other  biopolymers  and  their  derivatives  also  makes  them  indispensable  in  the  studies 
of  the  higher  order  structure  and  conformational  dynamics  of  such  macromolecular 
systems, which rely on various chemical probes (such as chemical labeling and cross-linking
studies, to be considered later in this section). Furthermore, the unique ability
of  ESI  to  generate  biomolecular  ions  directly  from  solutions  kept  under  physiologi- 
cally relevant  conditions  provides  other  opportunities  to  examine  higher  order  archi- 
tecture, dynamics  and  interactions  of  biopolymers,  as  detailed  below. 


7.3.1    Direct  ESI  MS  Measurements:  Characterization 
of  Non-covalent  Interactions  by  ESI  MS 

Both  ESI  and  MALDI  are  rightfully  credited  as  being  soft  ionization  techniques, 
since  they  allow  intact  biopolymers  to  be  transferred  from  a  condensed  phase  to  the 
vacuum  without  damaging  their  covalent  structure.  Furthermore,  it  was  recognized 
soon  after  the  introduction  of  these  techniques  into  the  mainstream  of  bioanalysis 
that  ESI  is  also  capable  of  generating  ions  representing  intact  non-covalent  macro- 
molecular  complexes  if  the  transition  from  solution  to  the  gas  phase  is  carried  out 
under  mild  desolvation  conditions  in  the  ESI  MS  interface.  The  two  parameters  that 
are  most  critical  for  the  survival  of  non-covalent  complexes  upon  this  transition  are 
the  ESI  interface  temperature  and  the  electrical  field  in  the  ion  desolvation  region, 
which  determines  the  average  kinetic  energy  of  ions  undergoing  frequent  collisions 
with  neutral  molecules  in  this  region.  Keeping  these  parameters  at  relatively  low 
levels  allows  the  composition  and  stoichiometry  of  macromolecular  assemblies  to 
be  determined  reliably  and  with  minimal  sample  consumption  (Fig.  7.10).  Not  only 
can  such  experiments  provide  information  on  the  stoichiometry  of  multi-protein 
complexes  [40-44],  but  they  may  also  reveal  the  presence  of  smaller  ligands  (e.g., 
metal  ions  and  small  organic  molecules)  within  these  non-covalent  assemblies  (see 
the  right-hand  panel  in  Fig.  7.10). 
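
To make the link between an ESI mass spectrum and stoichiometry concrete, the following sketch infers the charge and neutral mass of a species from two adjacent charge-state peaks and then estimates the number of subunits in a homo-oligomer. It assumes positive ions whose charge is carried entirely by protons and neglects residual solvent or counterion adducts; the 15,000 Da subunit and the charge states are hypothetical and are not taken from Fig. 7.10.

```python
PROTON = 1.007276  # mass of a proton, Da

def deconvolute(mz_high, mz_low):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_high carries charge z and mz_low carries charge z + 1 (same species).
    """
    z = round((mz_low - PROTON) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON)
    return z, mass

def subunit_count(complex_mass, subunit_mass):
    """Estimate the stoichiometry of a homo-oligomer as the nearest integer mass ratio."""
    return round(complex_mass / subunit_mass)

# Hypothetical homo-tetramer of 15,000 Da subunits observed at charge states +16 and +17.
true_mass = 60000.0
mz16 = (true_mass + 16 * PROTON) / 16
mz17 = (true_mass + 17 * PROTON) / 17
z, mass = deconvolute(mz16, mz17)
print(z, round(mass, 1), subunit_count(mass, 15000.0))
```

In practice the measured mass of a complex sprayed under mild conditions is somewhat higher than the sequence mass because of incomplete desolvation, so the integer ratio is usually more robust than the absolute mass.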

Reducing  the  efficiency  of  ion  desolvation  to  ensure  the  survival  of  non-covalent 
complexes  in  ESI  MS  is  needed  in  order  not  only  to  avoid  collisional  excitation  of 
these  species  in  the  gas  phase  but  also  to  preserve  a  layer  of  residual  solvent  mole- 
cules and  small  counterions,  which  are  often  critical  for  the  survival  of  large  macro- 
molecular  complexes  in  the  gas  phase  [45,  46].  A  frequent  (and  unfortunate) 
consequence  of  less-than-optimal  ion  desolvation  in  ESI  MS  interface  is  a  decrease 
of  the  accuracy  of  mass  measurements,  a  problem  that  can  be  dealt  with  very  effec- 
tively by  supplementing  mild  ESI  MS  measurements  with  those  carried  out  under 
harsher  conditions  [47].  Although  the  latter  step  leads  to  partial  dissociation  of  non- 
covalent  complexes  in  the  gas  phase  (Fig.  7.11),  the  surviving  assemblies  have 
lower  residual  solvation,  and  a  stepwise  increase  of  the  electrostatic  field  in  the 
interface region eventually results in dissociation of cofactors from the subunits,
thereby  allowing  low  molecular  weight  species  present  in  each  subunit  to  be  identi- 
fied and  the  stoichiometry  established. 

The ability of ESI MS to preserve non-covalent interactions has been used in the
past two decades in numerous studies aimed at establishing quaternary structure of
protein complexes [48]. These range from relatively modest structures to large
macromolecular assemblies whose molecular weight exceeds 1 MDa, such as intact
ribosomes [49] and viral capsids [43]. This approach has also been extremely successful
in probing other types of physiologically relevant non-covalent interactions,
such as protein-receptor binding [50]. ESI MS can also be used to monitor changes
in the composition of non-covalent associations in response to environmental factors
(such as solvent composition, protein concentration, etc.). This is illustrated in
Fig. 7.12 with acid-induced dissociation of dimeric hemoglobin from the mollusk
Scapharca, where the onset of subunit dissociation clearly manifests itself via the
appearance of the ionic signal representing globin monomers. Subsequent dissociation
of the heme group from the polypeptide chain is manifested by a mass shift of
globin monomer ions corresponding to a loss of ca. 617 Da. Early stages of protein
aggregation can also be monitored by ESI MS, e.g., by observing the appearance of
oligomeric protein ions in ESI MS in response to heat stress [51].

Fig. 7.10 ESI mass spectra of the apo- (gray trace) and holo- (black trace) forms of a regulatory
metalloprotein NikR from E. coli. The main panel shows the spectra acquired under near-native
conditions, when both forms assume a tetrameric structure, while the denaturing conditions (inset
on the left-hand side) result in complete loss of the physiologically relevant quaternary structure
and reveal only the presence of monomeric polypeptide chains. The detailed view of ionic peaks
of NikR at charge state +16 (inset on the right-hand side) shows the mass difference between the
ions representing the apo- and holo-forms, revealing the presence of a single metal ion in each
protein subunit of holo-NikR

Fig. 7.11 ESI mass spectra of bovine hemoglobin acquired under very mild desolvation conditions
(gray trace) to preserve all non-covalent complexes and with elevated collisional activation
in the ESI interface (black trace) to enhance ionic desolvation. Note the mass shifts of ionic peaks
corresponding to tetramers (α*β*)2 and dimers α*β* due to removal of a substantial fraction of
residual solvent molecules. Products of gas phase fragmentation are indicated with white circles
(not observed under the mild desolvation conditions)

Fig. 7.12 ESI MS monitoring of acid-induced dissociation and unfolding of homo-dimeric
hemoglobin from Scapharca (data courtesy of Prof. Wendell P. Griffith, University of Toledo)


7.3.2    Ionic  Charge  State  Distribution  as  an  Indicator 
of  Protein  Compactness  in  Solution 

So  far,  our  discussion  has  been  focused  solely  on  changes  of  the  ionic  mass  in  ESI 
MS  as  an  indicator  of  the  changes  in  the  protein  architecture  in  solution.  However, 
careful examination of ESI MS data presented in Fig. 7.12 reveals another interesting
phenomenon in addition to the dimer-to-monomer transition triggered by
the acidification of the protein solution. Unlike the charge state distribution of
dimer ions (α*)2, which remains narrow and contains only three charge states
(+11, +12, and +13) as long as the dimer ions can be detected in the mass spectra,
the charge state distributions of the monomer ions (both with and without the
heme group, α* and α) evolve as the solvent conditions change and become very
convoluted below pH 5. The distributions of ionic charges of both of these species
at pH 4 are bimodal, a feature that is usually attributed to the coexistence of
two  or  more  protein  conformations  in  solution  [9].  Native  or  near-native  protein 
structures  are  usually  very  compact,  and  they  can  accommodate  only  a  limited 
number  of  charges  upon  their  transfer  from  solution  to  the  gas  phase.  At  the  same 
time,  even  partial  unfolding  of  a  polypeptide  chain  results  in  an  increase  of  the 
solvent-accessible  surface  area,  which  allows  a  significantly  higher  number  of 
charges  to  be  accommodated  by  the  protein  upon  its  transfer  to  the  gas  phase. 
Native  and  nonnative  protein  states  often  coexist  at  equilibrium  under  mildly 
denaturing  conditions;  in  such  situations  protein  ion  charge  state  distributions  in 
ESI  MS  become  bimodal  (as  can  be  seen  in  the  two  top  panels  in  Fig.  7.12), 
reflecting  the  presence  of  both  native  and  denatured  states.  Therefore,  dramatic 
changes  of  protein  charge  state  distributions  often  serve  as  gauges  of  large-scale 
conformational  changes. 

The  less  compact  the  protein  becomes,  the  higher  the  extent  of  multiple  charging 
of  the  ions  representing  these  conformers  in  ESI  MS:  as  can  be  seen  in  Fig.  7.12, 
continuing  acidification  of  the  protein  solution  results  in  expansion  of  the  charge 
state  envelope  of  globin  monomers  (e.g.,  the  mass  spectrum  acquired  at  pH  3  con- 
tains charge  states  +25  and  higher,  which  are  not  present  in  the  spectrum  acquired 
at  pH  4).  This  behavior  may  be  indicative  of  the  presence  of  several  nonnative  con- 
formers  in  solution;  however,  making  a  distinction  between  the  contributions  made 
by  such  (partially)  unfolded  species  to  the  total  ionic  signal  is  not  very  straightfor- 
ward. Therefore,  changes  in  the  protein  ion  charge  state  distributions  are  frequently 
regarded  as  qualitative  indicators  of  re-  or  denaturation  that  do  not  provide  much 
information  beyond  loss  or  gain  of  the  native  fold. 

This  problem  can  be  addressed  at  least  in  some  cases  using  a  procedure  that 
utilizes  chemometric  tools  to  extract  semiquantitative  data  on  multiple  protein 
conformational  isomers  coexisting  in  solution  under  equilibrium  [52,  53]. 
Experiments  are  carried  out  by  acquiring  an  array  of  spectra  over  a  range  of  both 
near-native  and  denaturing  conditions  to  ensure  adequate  sampling  of  various  pro- 
tein states  and  significant  variation  of  their  respective  populations  within  the 
range  of  experimental  conditions.  The  total  number  of  protein  conformers  sam- 
pled in  the  course  of  the  experiment  can  be  determined  by  subjecting  the  set  of 
collected  spectra  to  singular  value  decomposition,  SVD  [54].  The  ionic  contribu- 
tions of  each  conformer  to  the  total  signal  can  then  be  determined  by  using  a 
supervised  minimization  routine.  Application  of  this  method  to  several  small 
model  proteins  has  yielded  a  picture  of  protein  behavior  consistent  with  that  based 
on the results of earlier studies that utilized a variety of orthogonal biophysical
approaches  [53,  55-59]. 
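
The first step of the chemometric procedure described above, estimating how many conformers contribute to a set of spectra, can be illustrated with a small synthetic data set. The sketch below is only a toy model: the Gaussian charge state envelopes, the populations, and the relative threshold of 1e-3 are all assumptions made for the example, and the subsequent supervised minimization step is not shown.

```python
import numpy as np

def count_components(spectra, rel_threshold=1e-3):
    """Count singular values above rel_threshold times the largest one.

    spectra: 2-D array, rows = charge states, columns = solution conditions.
    """
    singular_values = np.linalg.svd(spectra, compute_uv=False)
    return int(np.sum(singular_values > rel_threshold * singular_values[0]))

# Synthetic data: three conformers, each with a Gaussian charge state distribution.
charges = np.arange(6, 31)

def envelope(center, width):
    return np.exp(-0.5 * ((charges - center) / width) ** 2)

basis = np.column_stack([envelope(9, 1.0), envelope(15, 2.0), envelope(23, 3.0)])

# Hypothetical fractional populations of the three states at five pH values.
populations = np.array([
    [0.95, 0.70, 0.30, 0.05, 0.00],   # native-like
    [0.05, 0.25, 0.50, 0.35, 0.10],   # partially unfolded
    [0.00, 0.05, 0.20, 0.60, 0.90],   # extensively unfolded
])
spectra = basis @ populations          # shape: (charge states, conditions)
print(count_components(spectra))
```

With noise-free data the number of significant singular values equals the number of independent components. With experimental spectra the cutoff has to be chosen relative to the noise level, and that choice is a judgment call rather than a fixed rule.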


7.3.3    Hydrogen/Deuterium  Exchange  MS 

Perhaps  one  of  the  most  popular  and  powerful  MS-based  experimental  tools  that  is 
now  widely  used  to  study  protein  architecture  and  conformational  dynamics  is 
hydrogen-deuterium  exchange  (HDX).  The  analytical  value  of  HDX  as  a  tool  for 
probing  macromolecular  structure  was  recognized  almost  immediately  after  the  dis- 
covery of  deuterium  [60]  and  subsequent  development  of  the  methods  of  production 
of  heavy  water.  Initial  studies  of  the  exchange  reactions  between  organic  molecules 
and 2H2O carried out by Bonhoeffer and colleagues indicated that the exchange rate
is very high for hetero-atoms (e.g., -OH groups), while the hydrogen atoms attached
to carbon atoms (e.g., -CH3 groups) do not undergo exchange [61]. As early as the
mid-1950s, Hvidt and Linderstrøm-Lang used HDX to measure the solvent accessibility
of labile hydrogen atoms as a probe of polypeptide structure [62, 63], and
Burley  et  al.  suggested  that  the  extent  of  deuterium  incorporation  into  a  protein 
molecule  can  be  measured  by  monitoring  its  mass  increase  [64].  However,  it  was 
not  until  much  later  that  the  advent  of  ESI  and  MALDI  MS  dramatically  expanded 
the  range  of  biopolymers  for  which  the  extent  of  deuterium  incorporation  could  be 
measured  by  monitoring  the  protein  mass  evolution  directly  under  a  variety  of  con- 
ditions [65]. 

While  MS  is  not  the  only  means  of  detection  that  can  be  used  for  HDX  measure- 
ments (high-resolution  NMR  is  another  popular  choice),  MS  does  offer  several 
important  advantages,  namely  faster  time  scale,  tolerance  to  high-spin  ligands  and 
cofactors, ability to monitor the exchange in a conformer-specific fashion, as well
as  much  more  forgiving  molecular  weight  limitations.  The  ability  of  MS  to  handle 
larger  proteins  and  their  complexes  is  particularly  important  when  compared  to 
high-field  NMR,  which  still  has  limited  application  for  proteins  larger  than  ca. 
30  kDa.  Another  significant  advantage  offered  by  ESI  MS  is  its  superior  sensitivity, 
which  allows  many  experiments  to  be  carried  out  using  only  minute  quantities  of 
proteins. 

7.3.3.1    Basic  Principles  of  Protein  HDX 

HDX  targets  all  labile  hydrogen  atoms  (i.e.,  those  attached  to  nitrogen  atoms  at  the 
backbone  amides  and  heteroatoms  at  polar/charged  side  chains),  although  many 
labile  hydrogen  atoms  would  not  readily  undergo  HDX  due  to  their  involvement  in 
hydrogen  bonding  network  or  sequestration  from  the  solvent  in  the  protein  interior. 
Therefore,  protein  HDX  involves  two  different  types  of  reactions:  (1)  reversible  pro- 
tein unfolding  that  disrupts  the  H-bonding  network  and/or  exposes  buried  segments 
to  solvent  and  (2)  isotope  exchange  at  individual  unprotected  sites.  Since  protein 




Fig. 7.13 Intrinsic exchange rates of several types of labile hydrogen atoms as functions
of solution pH. Reproduced with permission from [132]

unfolding (either local or global) is a prerequisite for exchange at the sites that are
protected in the native conformation, HDX reactions serve as a reliable and sensitive
indicator of the unfolding events (protection means either involvement in the hydrogen
bonding network or sequestration from solvent in the protein core). However,
conformation  and  dynamics  are  not  the  only  determinants  of  the  HDX  kinetics. 
Even  in  the  absence  of  any  protection,  the  exchange  kinetics  of  a  labile  hydrogen 
atom  is  strongly  dependent  on  the  nature  of  the  functional  group.  Furthermore,  the 
exchange  rate  is  strongly  influenced  by  a  variety  of  extrinsic  factors,  most  notably 
solution  pH  and  temperature,  and  the  intrinsic  rate  constant  can  be  expressed  as  [66] 

k_{int} = k_{acid}[H^+] + k_{base}[OH^-] + k_W    (7.4)


The  pH  dependence  of  the  cumulative  intrinsic  exchange  rate  for  several  types  of 
labile  hydrogen  atoms,  calculated  based  on  the  data  compiled  by  Dempsey  [66]  is 
presented  in  Fig.  7.13. 
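
The shape of the pH dependence follows directly from Eq. (7.4): the acid- and base-catalyzed terms cross at a pH where the total rate passes through a minimum. The sketch below evaluates this for a single set of rate constants. The values of k_acid, k_base, and k_W are illustrative orders of magnitude chosen for the example, not the constants tabulated in [66], and pKw = 14 is assumed (temperature and isotope effects are ignored).

```python
import math

def k_int(pH, k_acid, k_base, k_w, pKw=14.0):
    """Intrinsic exchange rate constant from Eq. (7.4) at a given pH."""
    h = 10.0 ** (-pH)
    oh = 10.0 ** (pH - pKw)
    return k_acid * h + k_base * oh + k_w

def pH_of_minimum(k_acid, k_base, pKw=14.0):
    """pH at which the acid- and base-catalyzed terms are equal (the rate minimum)."""
    return 0.5 * (pKw + math.log10(k_acid / k_base))

# Illustrative constants (M^-1 min^-1 for k_acid and k_base, min^-1 for k_W).
K_ACID, K_BASE, K_W = 40.0, 1.0e10, 0.03

pH_min = pH_of_minimum(K_ACID, K_BASE)
print(f"rate minimum near pH {pH_min:.2f}")
for pH in (2.5, 5.0, 7.0):
    print(pH, f"{k_int(pH, K_ACID, K_BASE, K_W):.3g}")
```

Because the base-catalyzed term grows tenfold per pH unit above the minimum, moving from the minimum to neutral pH changes the intrinsic rate by several orders of magnitude.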

Backbone  amide  hydrogen  atoms  constitute  a  particularly  interesting  class  of 
labile  hydrogen  atoms  due  to  their  uniform  distribution  throughout  the  protein 
sequence,  which  makes  them  very  convenient  reporters  of  protein  dynamics  at  the 
amino  acid  residue  level  (proline  is  the  only  naturally  occurring  amino  acid  lacking 
an  amide  hydrogen  atom).  Therefore,  it  is  not  surprising  that  the  majority  of  HDX 
MS  experiments  are  concerned  with  the  exchange  of  the  backbone  amide  hydrogen 
atoms.  The  mathematical  formalism  that  is  often  used  to  describe  HDX  kinetics  of 
backbone  amides  was  introduced  several  decades  ago  and  is  based  upon  a  simple 
two-state kinetic model [67]:

\mathrm{NH(protected)} \underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}} \mathrm{NH(unprotected)} \xrightarrow{k_{int}} \mathrm{ND(unprotected)} \rightleftharpoons \mathrm{ND(protected)},    (7.5)

where k_op and k_cl are the rate constants for the opening (unfolding) and closing
(refolding)  events  that  expose/protect  a  particular  amide  hydrogen  to/from  exchange 
with  the  solvent. 

In  most  HDX  studies  the  exchange-incompetent  state  of  the  protein  is  considered 
to  be  its  native  state.  The  exchange-competent  state  is  thought  of  as  a  nonnative  struc- 
ture, which  can  be  either  fully  unfolded  (random  coil)  or  partially  unfolded  (interme- 
diate states).  Alternatively,  it  can  represent  a  structural  fluctuation  within  the  native 
conformation,  which  exposes  an  otherwise  protected  amide  hydrogen  to  solvent  tran- 
siently through  local  unfolding  or  "structural  breathing"  without  large-scale  structure 
loss  [68,  69].  Transitions  between  different  nonnative  states  under  equilibrium  condi- 
tions are  usually  ignored  in  mathematical  treatments  of  HDX,  since  the  majority  of 
HDX  measurements  are  carried  out  under  native  or  near-native  conditions. 


7.3.3.2    Global  HDX  MS  Measurements 

HDX  MS  measurements  can  provide  information  on  global  protection  by  measur- 
ing the  deuterium  content  of  the  entire  protein,  rather  than  the  exchange  kinetics  of 
individual  amide  hydrogen  atoms  (as  done  by  high-resolution  HDX  NMR).  Still, 
interpretation  of  HDX  MS  data  often  utilizes  the  kinetic  model  (7.5)  by  making  an 
implicit  assumption  that  NH(protected)  and  NH(unprotected)  represent  groups  of 
amides,  rather  than  individual  amides  that  become  unprotected  upon  transition  from 
one state to another. Two extreme cases are usually considered: k_cl ≫ k_int and
k_cl ≪ k_int. The former case (referred to as the EX2 exchange regime) is
commonly observed under native or near-native conditions, when each unfolding
event is very brief, and its lifetime (1/k_cl) is much shorter than the characteristic time
of exchange of an unprotected labile hydrogen atom (1/k_int). In this case the probability
of exchange for even a single amide during an unfolding event will be very
low, and the overall rate of exchange will be defined by both the frequency of
unfolding events (k_op) and the probability of exchange during a single opening event:

k_{HDX} = k_{op} (k_{int}/k_{cl}) = k_{int} K,    (7.6)

where K = k_op/k_cl is an effective equilibrium constant for the unfolding reaction, which is
determined by the free energy difference between the two states of the protein. The
overall exchange rate constant k_HDX in this case is a cumulative rate of exchange, i.e.,
an  ensemble-averaged  rate  of  deuterium  incorporation  into  a  molecule,  and  is  mea- 
sured as  a  mass  shift  of  the  isotopic  cluster  of  a  protein  ion  as  a  function  of  HDX  time. 
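
Since K in Eq. (7.6) is an equilibrium constant, an EX2 measurement can be converted into a free energy of the opening reaction through the standard thermodynamic relation ΔG = -RT ln K. The sketch below applies it to a hypothetical amide; the rate constants and the temperature of 298.15 K are assumptions chosen for illustration, and the conversion is meaningful only when the EX2 condition actually holds.

```python
import math

R = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1

def unfolding_free_energy(k_hdx, k_int, temperature_k=298.15):
    """Free energy of the opening reaction implied by Eq. (7.6): K = k_HDX / k_int."""
    K = k_hdx / k_int
    return -R * temperature_k * math.log(K)

# Hypothetical amide: observed exchange 1e4-fold slower than the intrinsic rate.
dG = unfolding_free_energy(k_hdx=1.0e-3, k_int=10.0)
print(f"dG = {dG:.1f} kJ/mol")
```

The ratio k_int/k_HDX is often called the protection factor; each tenfold increase in protection corresponds to roughly 5.7 kJ/mol at room temperature.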

The opposite extreme (k_cl ≪ k_int) is observed either when the protein is placed
under denaturing conditions (which dramatically decreases the refolding rate k_cl) or
when the intrinsic exchange rate is increased (e.g., by elevating the protein solution
pH; see Fig. 7.13). As a result, the lifetime of the unprotected state becomes long
enough to allow all exposed labile hydrogen atoms to be exchanged during a single
unfolding event. In this case (commonly referred to as the EX1 exchange regime)
the exchange rate will be determined simply by the rate of protein unfolding:

k_{HDX} = k_{op}    (7.7)

HDX  MS  measurements  carried  out  under  the  EX1  conditions  typically  give  rise 
to  bi-  or  multimodal  isotopic  distributions,  where  the  deuterium  content  of  each  part 
reflects  the  backbone  protection  levels  of  distinct  protein  conformers.  This  gives 
HDX  MS  the  unique  ability  to  visualize  and  track  multiple  protein  states  that  may 
coexist in solution under equilibrium [70].
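
The EX2 and EX1 expressions can be viewed as two limits of a single steady-state rate law for scheme (7.5). The sketch below uses the commonly quoted form k_HDX = k_op k_int / (k_cl + k_int), which is not written out in this section and which assumes that the protected state is strongly favored (k_op ≪ k_cl). All rate constants are hypothetical.

```python
def k_hdx(k_op, k_cl, k_int):
    """Steady-state exchange rate for the two-state scheme (7.5), assuming k_op << k_cl."""
    return k_op * k_int / (k_cl + k_int)

k_op = 1.0e-4   # hypothetical opening rate, s^-1
k_int = 10.0    # hypothetical intrinsic exchange rate, s^-1

# EX2 limit: closing is much faster than exchange, so k_HDX -> k_int * (k_op / k_cl).
k_cl_fast = 1.0e5
ex2_exact = k_hdx(k_op, k_cl_fast, k_int)
ex2_limit = k_int * k_op / k_cl_fast
print(ex2_exact, ex2_limit)

# EX1 limit: closing is much slower than exchange, so k_HDX -> k_op.
k_cl_slow = 1.0e-2
ex1_exact = k_hdx(k_op, k_cl_slow, k_int)
print(ex1_exact, k_op)
```

Between the two limits neither simplification applies, and isotopic distributions may show features of both regimes.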


7.3.3.3    Local  HDX  MS  Measurements 

Replacement  of  each  hydrogen  with  a  deuteron  (or  vice  versa)  results  in  a  protein 
mass  change  of  about  1  Da,  which  makes  MS  a  very  sensitive  and  reliable  detector 
of  the  progress  of  protein  HDX  reactions.  Mass  measurements  of  proteins  undergo- 
ing HDX  are  usually  carried  out  following  rapid  acidification  of  the  protein  solution 
to  pH  2.5-3  and  lowering  the  temperature  to  0-4  °C,  which  results  in  significant 
deceleration  of  the  chemical  (intrinsic)  exchange  rates  of  backbone  amide  hydrogen 
atoms  (see  Fig.  7.13).  These  conditions,  known  as  HDX  quenching  or  slow  exchange 
conditions,  also  result  in  unfolding  of  most  proteins.  Since  the  intrinsic  exchange 
rates  of  labile  side  chain  hydrogen  atoms  are  not  decelerated  as  significantly  as 
those  for  backbone  amides,  all  information  on  the  side  chain  protection  is  generally 
lost  during  this  step,  leaving  a  single  HDX  reporter  for  each  amino  acid  residue 
(again,  with  the  exception  of  proline  residues).  Another  fortunate  consequence  of 
quench-induced  protein  denaturation  is  dissociation  of  all  non-covalently  bound 
ligands  (ranging  from  metal  cations  and  small  organic  molecules  to  other  biopoly- 
mers)  from  the  protein.  Therefore,  measuring  the  protein  mass  under  these  condi- 
tions provides  information  only  on  the  protein  conformation  and  stability,  rather 
than  composition  of  non-covalent  complexes  formed  by  the  protein  and  its  ligands. 
In  addition  to  characterizing  protein  conformation  and  stability  globally,  the  protein 
can  be  digested  with  an  acidic  protease  (e.g.,  pepsin)  under  the  slow  exchange  con- 
ditions, and  MS  (usually  following  quick  desalting  and  fast  LC  separation)  can  be 
used  to  measure  the  deuterium  content  of  each  proteolytic  fragment.  This  produces 
information  on  protein  conformation  and  dynamics  at  the  local  level.  A  typical 
workflow  diagram  of  an  HDX  MS  experiment  is  shown  in  Fig.  7.14. 
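
The deuterium content of a proteolytic fragment is usually obtained from the centroid of its isotopic cluster. The sketch below shows one common way of doing this, with a correction for back-exchange based on unlabeled and fully exchanged reference samples. The correction formula, the peptide sequence, and all masses are assumptions made for the example; the count of exchangeable sites follows the text in taking one backbone amide per residue, excluding proline and the N-terminal residue.

```python
def centroid(mz_values, intensities):
    """Intensity-weighted average of an isotopic cluster."""
    total = sum(intensities)
    return sum(m * i for m, i in zip(mz_values, intensities)) / total

def backbone_amides(peptide):
    """Backbone amide hydrogens: one per residue except the N-terminal one and prolines."""
    return sum(1 for residue in peptide[1:] if residue != "P")

def deuterium_uptake(m_t, m_0, m_full, peptide):
    """Back-exchange-corrected deuterium count from centroid masses.

    m_t: labeled sample; m_0: unlabeled control; m_full: fully exchanged control.
    """
    return (m_t - m_0) / (m_full - m_0) * backbone_amides(peptide)

# Hypothetical peptic peptide and centroid masses (Da), for illustration only.
peptide = "LVPAGESLF"
m_0 = centroid([903.0, 904.0], [1.0, 1.0])   # toy two-peak cluster
uptake = deuterium_uptake(m_t=905.9, m_0=m_0, m_full=908.3, peptide=peptide)
print(backbone_amides(peptide), round(uptake, 2))
```

The fully exchanged control is processed under the same quench, digestion, and LC conditions as the sample, so the ratio compensates for deuterium lost during sample handling.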

Spatial  resolution  offered  by  HDX  MS  is  usually  limited  only  by  the  extent  of 
proteolysis,  which  (along  with  other  sample-handling  steps)  must  be  performed 
relatively  quickly  under  the  slow  exchange  conditions  to  avoid  occurrence  of  sig- 
nificant back-exchange  prior  to  MS  measurements  of  the  deuterium  content  of  indi- 
vidual peptide  fragments.  In  general,  a  large  number  of  proteolytic  fragments, 
particularly  overlapping  ones,  would  lead  to  greater  spatial  resolution,  and  hence 
more  precise  localization  of  the  structural  regions  which  have  undergone  exchange. 
In  some  cases,  this  may  allow  the  backbone  amide  protection  patterns  to  be  deter- 
mined at  single-residue  resolution  [71],  although  such  instances  remain  very  rare. 
Supplementation  of  enzymatic  digestion  with  peptide  ion  fragmentation  in  the  gas 
phase  may  also  enhance  the  spatial  resolution  of  HDX  MS  measurements  [72],  but 
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Fig.  7.14  Schematic  representation  of  HDX  MS  work  flow  to  examine  protein  higher  order  struc- 
ture and  conformational  dynamics.  The  exchange  is  initiated  by  placing  the  unlabeled  protein  into 
a  D20-based  solvent  system  (e.g.,  by  a  rapid  dilution).  Unstructured  and  highly  dynamic  protein 
segments  undergo  fast  exchange  (blue  and  red  colors  represent  protons  and  deuterons,  respec- 
tively). Following  the  quench  step  (rapid  solution  acidification  and  temperature  drop),  the  protein 
loses  its  native  conformation,  but  the  spatial  distribution  of  backbone  amide  protons  and  deuterons 
across  the  backbone  is  preserved  (all  labile  hydrogen  atoms  at  side  chains  undergo  fast  back- 
exchange  at  this  step).  Rapid  clean-up  followed  by  MS  measurement  of  the  protein  mass  reports 
the  total  number  of  backbone  amide  hydrogen  atoms  exchanged  under  native  conditions  (a  global 
measure  of  the  protein  stability  under  native  conditions),  as  long  as  the  quench  conditions  are 
maintained  during  the  sample  work-up  and  measurement.  Alternatively,  the  protein  can  by  digested 
under  the  quench  conditions  using  acid-stable  protease(s),  and  LC/MS  analysis  of  masses  of  indi- 
vidual proteolytic  fragments  will  provide  information  on  the  backbone  protection  of  corresponding 
protein  segments  under  the  native  conditions.  Reproduced  with  permission  from  [133] 


this  technique  has  yet  to  be  commonly  accepted  due  to  concerns  over  the  possibility 
of  introducing  gas  phase  artifacts  [73].  In  addition  to  limited  spatial  resolution, 
HDX  MS  measurements  frequently  suffer  from  incomplete  sequence  coverage, 
especially  when  applied  to  larger  and  extensively  glycosylated  proteins.  Proteins 
with  multiple  disulfide  bonds  constitute  another  class  of  targets  for  which  adequate 
sequence  coverage  is  difficult  to  achieve,  although  certain  changes  in  experimental 
protocol can alleviate this problem, at least for smaller proteins [74]. Typically, an
80% level of sequence coverage is considered good, although significantly lower
levels may also be adequate, depending on the context of the study.

Fig. 7.15 Localization of the receptor binding interface on the surface of human serum transferrin
(Tf) with HDX MS. Left panel: HDX MS of Tf (global exchange) in the presence (blue) and the
absence (red) of the receptor. The exchange was carried out by diluting the protein stock solution
1:10 in exchange solution (100 mM NH4HCO3 in D2O, pH adjusted to 7.4) and incubating for a
certain period of time as indicated on each diagram followed by rapid quenching (lowering pH to
2.5 and temperature to near 0 °C). The black trace shows unlabeled protein. Right panel: isotopic
distributions of representative peptic fragments derived from Tf subjected to HDX in the presence
(blue) and the absence (red) of the receptor and followed by rapid quenching, proteolysis, and LC/MS
analysis. Dotted lines indicate deuterium content of unlabeled and fully exchanged peptides.
Colored segments within the Tf/receptor complex show localization of these peptic fragments
(based on the low-resolution structure of the complex). Adapted with permission from [73]
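
Sequence coverage, as used above, is simply the percentage of residues that fall within at least one identified proteolytic fragment. The following Python sketch illustrates the calculation. The function name and the example segments are hypothetical and are not taken from the transferrin study.

```python
def sequence_coverage(protein_length, peptides):
    """Return the percentage of residues covered by at least one peptide.

    peptides is a list of (start, end) pairs, 1-based and inclusive,
    e.g. (71, 81) for a segment written as [71-81].
    """
    covered = set()
    for start, end in peptides:
        if start < 1 or end > protein_length or start > end:
            raise ValueError(f"invalid segment [{start}-{end}]")
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


# Hypothetical 100-residue protein with three identified fragments,
# two of which overlap.
example_peptides = [(1, 30), (25, 60), (81, 100)]
print(f"{sequence_coverage(100, example_peptides):.0f}% coverage")
```

Overlapping fragments are counted once, so redundancy among peptides does not inflate the reported coverage.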

An  example  of  using  HDX  MS  to  probe  protein  conformation  and  dynamics,  as 
well  as  to  identify  binding  interface  regions  in  a  protein/receptor  complex  is  shown 
in  Fig.  7.15,  where  hydrogen  exchange  kinetics  are  measured  for  a  diferric  form  of 
human  serum  transferrin  (Fe2Tf)  alone  and  in  complex  with  its  cognate  receptor. 
Both Tf-metal and Tf-receptor complexes dissociate under the slow exchange conditions
prior to MS analysis; therefore, the protein mass evolution in each case
reflects solely deuterium uptake in the course of exchange in solution (left panel in
Fig. 7.15). The extra protection afforded by the receptor binding to Tf persists over
an  extended  period  of  time,  and  it  may  be  tempting  to  assign  it  to  shielding  of  labile 
hydrogen atoms at the protein-receptor interface. However, this view is overly simplistic,
as the conformational effects of protein binding are frequently felt well
beyond  the  interface  region.  The  difference  in  the  backbone  protection  levels  of 
receptor-free  and  receptor-bound  forms  of  Fe2Tf  appears  to  grow  during  the  initial 
hour  of  exchange,  reflecting  significant  stabilization  of  Fe2Tf  higher  order  structure 
by  the  bound  receptor.  Indeed,  while  the  fast  phase  of  HDX  is  often  ascribed  to 
frequent local fluctuations (transient perturbations of higher order structure) affecting
relatively small protein segments, the slower phases of HDX usually reflect relatively
rare, large-scale conformational transitions, such as transient partial or
complete  protein  unfolding  [75]. 

Evolution  of  the  deuterium  content  of  various  peptic  fragments  of  Fe2Tf  (right 
panel  in  Fig.  7.15)  reveals  a  wide  spectrum  of  protection,  which  is  distributed  very 
unevenly  across  the  protein  sequence.  While  some  peptides  exhibit  nearly  complete 
protection  of  backbone  amides  (e.g.,  segment  [396-408]  sequestered  in  the  core  of 
the  protein  C-lobe),  exchange  in  many  others  is  fast  (e.g.,  peptide  [612-621]  in  the 
solvent-exposed  loop  of  the  C-lobe).  The  influence  of  receptor  binding  on  backbone 
protection  is  also  highly  localized.  While  most  segments  appear  to  be  unaffected  by 
the receptor binding, there are several regions where exchange kinetics are noticeably
decelerated (e.g., segment [71-81] of the N-lobe, which contains several amino
acid  residues  that  form  the  Tf/receptor  interface  according  to  the  available  model  of 
the  complex  based  on  low-resolution  cryo-EM  data  [76]). 
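
The deuterium content of a peptic fragment is commonly expressed relative to the unlabeled and fully exchanged controls (the dotted lines in the right panel of Fig. 7.15). A minimal sketch of that normalization is given below. It assumes that centroid masses have already been extracted from the isotopic distributions, and all function names and numerical values are invented for illustration.

```python
def fractional_uptake(m_t, m_0, m_100):
    """Fraction of exchangeable sites deuterated after exchange time t.

    m_t   : centroid mass of the peptide after exchange for time t
    m_0   : centroid mass of the unlabeled peptide
    m_100 : centroid mass of the fully exchanged control
    """
    if m_100 <= m_0:
        raise ValueError("fully exchanged mass must exceed unlabeled mass")
    return (m_t - m_0) / (m_100 - m_0)


def protection_gain(m_free, m_bound, m_0, m_100):
    """Uptake of the receptor-free form minus uptake of the bound form.

    A positive value means the segment is more protected in the complex.
    """
    return (fractional_uptake(m_free, m_0, m_100)
            - fractional_uptake(m_bound, m_0, m_100))


# Hypothetical peptide: 1200.0 Da unlabeled, 1208.0 Da fully exchanged.
print(protection_gain(m_free=1206.0, m_bound=1202.0, m_0=1200.0, m_100=1208.0))
```

Normalizing to the fully exchanged control also compensates, to a first approximation, for deuterium lost to back-exchange during sample work-up.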

7.3.3.4    Local  HDX  MS  Measurements  Using  a  Top-Down  Approach 

An alternative method to probe HDX kinetics locally that does not require proteolytic
fragmentation prior to MS analysis takes advantage of the ability of modern
mass spectrometers to produce a wealth of structural information in tandem (MS/MS)
experiments at the protein level (the top-down approach to protein sequencing
discussed in Sect. 7.2.1.2). One unique advantage of top-down HDX MS measurements
that cannot be matched by the classic bottom-up experiments is the
ability to obtain protection patterns in a conformer-specific fashion. This can be
accomplished by fragmenting subpopulations of protein ions, which are mass
selected to include species with deuterium content representative of a certain protein
conformer (this, of course, can be accomplished only under conditions favoring
the EX1 exchange regime in solution, so that different protein conformers can be
distinguished by their different levels of deuterium incorporation).

Despite the great promise of top-down HDX MS [73], applications of this technique
have been limited so far due to concerns over the possibility of hydrogen
scrambling accompanying dissociation of protein ions in the gas phase. Several
recent studies demonstrated that the extent of scrambling is indeed negligible when
ECD [77] or ETD [78] is used as a means of generating fragment ions in top-down
HDX MS experiments. In addition to allowing hydrogen scrambling to be
eliminated in the top-down HDX MS experiments, both ECD and ETD appear to be
superior to collisional activation in terms of generating a larger number of structurally
diagnostic ions [79], allowing both better sequence coverage and enhanced
spatial resolution to be achieved. In fact, in some cases it becomes possible to generate
patterns of deuterium distribution across the protein backbone down to the
single-residue level [77, 80].


7.3.4    Chemical  Cross-Linking  of  Proteins 

Chemical  cross-linking  is  a  classical  biochemical  technique  used  to  characterize 
protein  conformation,  and  it  benefits  tremendously  from  the  ability  of  modern  MS 
to  detect  and  identify  the  products  of  the  cross-linking  reactions.  Cross-linking 
reagents  are  generally  classified  based  on  their  chemical  specificity  and  the  length 
of  the  spacer  arm  (cross-bridge  formed  between  the  two  cross-linked  sites  when 
the  reaction  is  complete).  The  chemical  specificity  of  a  cross-linker  determines  the 
overall  pool  of  reactive  groups  within  the  polypeptide  that  may  participate  in  the 
cross-linking reaction. Eight out of the 20 amino acid side chains are chemically
reactive with good selectivity: Arg (guanidinyl), Lys (ε-amine), Asp and Glu (β- and
γ-carboxylates), Cys (sulfhydryl), His (imidazole), Met (thioether), Trp (indolyl),
and Tyr (phenolic hydroxyl) [81], although virtually no reagent is absolutely
group-specific.

Monofunctional (or zero-length) cross-linkers induce direct coupling of two
functional groups of the protein without incorporating any extraneous material into
the protein. Obviously, this becomes possible only if the two functional groups are
in very close proximity to each other, in which case the cross-linker operates as a
condensing agent, resulting in the cross-linked residues becoming directly interjoined.
Bifunctional cross-linkers, on the other hand, contain two reactive groups linked
through a spacer arm, thus allowing the coupling of functional groups whose separation
does not exceed the spacer's length. Bifunctional reagents are further subdivided
into homobifunctional (i.e., both cross-linking groups within the reagent
targeting the same reactive groups on the protein) and heterobifunctional cross-linkers
(coupling different functional groups on the protein).
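
Because a bifunctional reagent can only bridge groups whose separation does not exceed the spacer length, each observed cross-link acts as an upper bound on a distance in the structure. The sketch below shows one way such bounds might be checked against a candidate model. The coordinates, residue numbers, and the 12 Å cutoff are hypothetical; in practice the cutoff would be chosen for the specific reagent and would usually allow for the length of the side chains as well.

```python
import math


def check_cross_links(coords, cross_links, max_distance):
    """Test each cross-link against a distance cutoff.

    coords       : dict mapping residue number -> (x, y, z) in angstroms
    cross_links  : list of (residue_i, residue_j) pairs
    max_distance : largest separation (angstroms) the reagent can bridge
    Returns a list of (residue_i, residue_j, distance, satisfied).
    """
    results = []
    for i, j in cross_links:
        d = math.dist(coords[i], coords[j])
        results.append((i, j, d, d <= max_distance))
    return results


# Hypothetical model coordinates for three residues.
model = {10: (0.0, 0.0, 0.0), 42: (3.0, 4.0, 0.0), 90: (0.0, 0.0, 30.0)}
for i, j, d, ok in check_cross_links(model, [(10, 42), (10, 90)], 12.0):
    print(f"{i}-{j}: {d:.1f} A, {'satisfied' if ok else 'violated'}")
```

A violated bound does not necessarily mean the model is wrong; it may also point to conformational flexibility or to an inter-subunit rather than intra-subunit link.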

Heterobifunctional cross-linkers may incorporate a photosensitive (nonspecific)
reagent in addition to a conventional (group-specific) functionality. Such photosensitive
groups react indiscriminately upon activation by irradiation. Once the specific
end of such a cross-linker is anchored to an amino acid residue, the photo-reactive
end can be used to probe the surroundings of this amino acid. More information on
chemical cross-linkers can be found in several excellent reviews on the subject [82-85]
and an outstanding book by Wong [81].

MS-assisted cross-linking studies usually aim to identify the pairs of cross-linked
residues within the protein or protein complex. Such information may provide
through-space distance constraints that are extremely valuable for defining both tertiary
(intra-subunit cross-links) and quaternary (inter-subunit cross-links) organization
of the protein when no other structural information is available. Confident
assignment of the pairs of coupled residues within the cross-linked protein(s) is a
rather challenging experimental task. A combination of proteolysis, separation
methods (e.g., LC), and mass spectrometry (and, particularly, MS/MS) provides
perhaps the most elegant and efficient way of solving this problem [84, 86, 87].
Figure 7.16 shows a workflow of a typical cross-linking experiment. Separation of
proteolytic fragments prior to MS analysis usually results in significant improvements
in sensitivity by eliminating possible signal suppression effects that may otherwise
result in discrimination against larger (cross-linked) fragments [86]. Although
peptide mapping alone can sometimes lead to confident identification of the cross-linked
residues [88-90], unambiguous assignment of cross-linked peptides requires
that MS/MS sequencing of the proteolytic fragments be carried out [91, 92].

Fig. 7.16 A schematic diagram of the workflow for cross-linking a multi-protein complex and integrating
the levels of information into a three-dimensional model of the structure. Reprinted with
permission from [86]

As  the  amount  of  information  deduced  from  cross-linking  experiments  increases, 
so  does  the  complexity  of  data  interpretation,  and  the  tools  of  bioinformatics  become 
absolutely  essential  to  interpret  the  results  of  cross-linking  experiments  [93].  The 
task  of  assigning  the  cross-linked  peptides  and  localizing  the  modification  sites  can 
be  greatly  assisted  by  a  variety  of  automated  algorithms  that  use  MS  or  MS/MS  data 
as  input  [86,  93].  The  database  mining  approach  to  identification  of  cross-linked 
peptides  mentioned  earlier  in  this  section  [94]  can  be  used  even  in  a  situation  when 
the  protein  complex  composition  is  not  known  a  priori  [95].  More  sophisticated 
approaches, such as Xlink-Identifier [96], allow the cross-linking sites to be localized
with high precision by identifying inter- and intra-peptide cross-links in addition
to dead-end products and underivatized peptides. Another comprehensive
cross-linking  data  analysis  platform  is  MS-Bridge  [97],  which  is  part  of  the  Protein 
Prospector  MS  data  analysis  suite.  While  these  platforms  were  developed  to  support 
label-free  analyses,  several  other  algorithms  have  been  developed  to  take  advantage 
of  isotopically  tagged  cross-linkers  [98-101].  A  comprehensive  list  of  data  analysis 
programs  developed  for  interpretation  of  the  results  of  cross-linking  experiments 
can  be  found  in  a  recent  review  article  [87]. 
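
At the MS level, the core of such automated assignment is a mass match: an inter-peptide cross-link should appear at the sum of the two peptide masses plus the mass added by the bridge. The sketch below illustrates only this first step and is not a description of how any of the programs cited above work. The peptide names and masses are invented, and the linker mass increment is a placeholder that must be replaced by the value for the reagent actually used.

```python
from itertools import combinations_with_replacement


def match_cross_links(observed_mass, peptide_masses, linker_delta, tol_ppm=10.0):
    """List peptide pairs whose cross-linked mass matches an observed mass.

    observed_mass  : neutral monoisotopic mass of the candidate species (Da)
    peptide_masses : dict mapping peptide name -> neutral monoisotopic mass (Da)
    linker_delta   : mass added by the cross-link bridge (Da)
    tol_ppm        : allowed relative mass error in parts per million
    """
    hits = []
    pairs = combinations_with_replacement(peptide_masses.items(), 2)
    for (name_a, m_a), (name_b, m_b) in pairs:
        theoretical = m_a + m_b + linker_delta
        error_ppm = 1e6 * (observed_mass - theoretical) / theoretical
        if abs(error_ppm) <= tol_ppm:
            hits.append((name_a, name_b, theoretical, error_ppm))
    return hits


# Hypothetical digest and placeholder linker increment.
digest = {"P1": 1000.5, "P2": 1500.75, "P3": 800.4}
for a, b, mass, err in match_cross_links(2639.3181, digest, linker_delta=138.0681):
    print(f"{a} + {b}: {mass:.4f} Da ({err:+.2f} ppm)")
```

A mass match alone yields candidates only; as noted above, confirming the pair and localizing the linked residues requires MS/MS data. Dead-end and intra-peptide products would need their own mass formulas and are omitted here.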


7.3.5    Chemical  Labeling 

Selective chemical modification [102] is another classical biophysical technique that
has benefited tremendously from the recent progress in MS hardware and methodology.
The unique ability of MS to localize both shielded and modified residues within
a protein molecule has transformed the chemical labeling technique into a highly efficient
probe of higher order macromolecular structure. Most chemical modifications of an
amino  acid  side  chain  alter  the  protein  mass,  hence  the  appeal  of  mass  spectrometry 
as  a  readout  tool  for  the  outcome  of  such  experiments.  Interpretation  of  the  MS  and 
MS/MS  data  on  chemically  modified  proteins  is  usually  relatively  straightforward 
(as  compared  to  the  analysis  of  cross-linked  proteins)  and  greatly  benefits  from  a 
vast  arsenal  of  experimental  tools  developed  to  analyze  PTM  of  proteins. 

In a typical experiment, protein exposure to a certain chemical probe is followed
by digestion of the modified protein with a suitable proteolytic enzyme and mass
mapping of the fragment peptides. The position(s) of the modified residue(s) within
each proteolytic fragment can be reliably established using tandem mass spectrometry,
as the presence of a chemical modification manifests itself as a break or a shift
in the ladder of the expected fragment ions. Inter-subunit binding topology is usually
determined by comparing modification patterns of the protein obtained in the
presence and in the absence of its binding partner [103], although the two experiments
can be combined if the labeling agent contains a stable isotope tag [104]. An
added benefit of using isotope tags is the easy recognition and quantitation of label-containing
peptides and their fragments in MS and MS/MS spectra.
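
The "shift in the ladder" can be made concrete with a short calculation of singly charged b-ion masses. The sketch below is illustrative only: the peptide GAMLK and the +15.9949 Da shift on Met (a single oxidation) are hypothetical examples, and only b-type ions are considered.

```python
# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276  # mass of a proton (Da)


def b_ions(sequence, mods=None):
    """Return m/z of singly charged b ions b1..b(n-1).

    mods maps a 1-based residue position to a mass shift in Da.
    """
    mods = mods or {}
    ladder, total = [], 0.0
    for pos, aa in enumerate(sequence, start=1):
        total += RESIDUE_MASS[aa] + mods.get(pos, 0.0)
        if pos < len(sequence):
            ladder.append(total + PROTON)
    return ladder


def first_shifted_ion(sequence, mods):
    """Index of the first b ion whose mass differs from the unmodified ladder."""
    plain, modified = b_ions(sequence), b_ions(sequence, mods)
    for index, (p, m) in enumerate(zip(plain, modified), start=1):
        if abs(m - p) > 1e-6:
            return index
    return None


# Hypothetical peptide with an oxidized Met at position 3.
print(first_shifted_ion("GAMLK", {3: 15.9949}))
```

All b ions from the modified residue onward carry the shift, whereas those before it do not, which is what localizes the site. A modification on the C-terminal residue would not show up in the b series at all and would have to be located from the complementary y ions.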

In addition to selective chemical labeling, protein conformation can also be characterized
with non-selective labeling, which offers the additional advantage of
being able to determine the solvent exposure of several types of amino acids simultaneously
in a single experiment. So far, the hydroxyl radical (OH•) is the most popular
nonspecific modifier, due to its ability to induce side chain oxidation for a variety
of amino acids and the relative ease of its generation in solution. Although the
hydroxyl radical is relatively nondiscriminatory and can modify virtually all types
of amino acid side chains [105], the most susceptible to OH• attack are side chains
containing sulfur atoms (Cys and Met), including disulfide-bonded Cys residues.
The least susceptible to OH• attack are Gly, Asn, Asp, and Ala, whose reactivity
is three orders of magnitude lower than that of Cys. The great variety of OH•-induced
oxidation products and the large number of potential targets place a premium on the
ability to detect and identify the modification sites. Usually proteolytic degradation
of the modified protein followed by LC/MS and MS/MS analyses is needed in order
to achieve reliable identification of oxidatively labeled amino acid side chains [105-107].
As is the case with the analysis of the results of chemical cross-linking experiments,
extracting useful information from covalent labeling experimental data
greatly benefits from automation [108].
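
One quantity that such automated processing typically reports is the extent of modification of each peptide, estimated from the LC/MS signal of the modified and unmodified forms. The sketch below shows this bookkeeping under the simplifying assumption that all forms of a peptide are detected with equal efficiency. The function name and intensities are hypothetical.

```python
def fraction_modified(unmodified_intensity, modified_intensities):
    """Estimate the fraction of a peptide that carries a label.

    unmodified_intensity : summed signal of the unmodified peptide
    modified_intensities : list of signals, one per modified form
    Assumes equal detection efficiency for all forms of the peptide.
    """
    modified = sum(modified_intensities)
    total = unmodified_intensity + modified
    if total <= 0:
        raise ValueError("no signal for this peptide")
    return modified / total


# Hypothetical intensities for one peptide, free and with binding partner.
free = fraction_modified(600.0, [300.0, 100.0])
bound = fraction_modified(900.0, [80.0, 20.0])
print(f"free: {free:.2f}, bound: {bound:.2f}")
```

A peptide whose extent of modification drops in the presence of the binding partner is a candidate for a protected region, subject to the caveats about conformational perturbation discussed in the text.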

One important consideration that must be kept in mind when designing or interpreting
the results of both selective chemical and nonselective (oxidative) labeling
experiments is that structural information derived from such measurements
is reliable only if the protein maintains its conformation during the experiment
[109]. Most chemical modifications change the charge of the
labeled amino acid residue, and a significant alteration of the protein surface charge
distribution may obviously result in conformational change. Furthermore, even the
sheer size of many groups used as covalent labels may interfere with the protein's
ability to maintain its conformation by creating steric constraints. Despite the
seriousness of this concern, fewer than half of all studies utilizing selective
chemical labeling that were conducted in the past decade employed any means of
ensuring the integrity of protein higher order structure during the experiments [109].
Artifacts associated with the influence of chemical modifications on the protein
conformation can be avoided by limiting the number of modifications to one per
protein molecule (in this way, reactivity of any amino acid side chain is determined
only by the unperturbed protein structure [109]). While the extent of protein modification
can be kept low to minimize conformational perturbations [106], this inevitably
has a negative impact on the sensitivity of the measurements. A very elegant
solution to this problem is based upon the realization that the extent of artifacts
introduced by chemical labeling depends not only on the extent of protein oxidation
but also on the time frame of the oxidation process [110]. Should this reaction time
window be sufficiently narrow compared to the time scale of conformational
changes (sub-millisecond range), the labeling pattern would reflect only the native
structure of the protein, even if the number of modified sites on each protein is significant.
These considerations form the basis of a highly successful technique called
fast photochemical oxidation of proteins (FPOP), where solvent-exposed amino
acid residues are oxidized by OH• radicals produced by the photolysis of H2O2.
FPOP is designed to limit protein exposure to radicals to less than 1 μs by employing a
pulsed laser to generate the radicals and a radical scavenger to limit
their lifetimes [111].


7.3.6    Higher Order Structure of Other Biopolymers

7.3.6.1    DNA  Higher  Order  Structure 

Until  very  recently,  there  was  substantially  less  interest  in  developing  MS-based 
methods  to  probe  higher  order  structure  of  DNA  molecules,  since  they  were  thought 
to  adopt  only  relatively  few  favored  conformations  (unlike  proteins).  Nevertheless, 
apart  from  the  Watson-Crick  double  helical  DNA  structure  (which  is  also  known  as 
the  B-form  DNA),  a  large  number  of  other  structures  have  been  shown  to  exist, 
which  either  differ  from  the  B  conformation  by  arrangement  of  the  two  strands  in 
the  double  helix  (the  so-called  A  and  Z  conformations),  or  by  incorporating  more 
than just two strands (e.g., triplexes and quadruplexes) [112]. Several of these non-classical
DNA conformations came to prominence recently either due to their
importance in designing novel therapeutic strategies [113] or for their potential use
in nano-technological applications, e.g., as scaffolds of building blocks in molecular
devices [114].

Similar  to  the  studies  of  protein  non-covalent  complexes  discussed  in  Sect.  7.3.1, 
ESI MS can also be used to obtain mass spectra of intact double-stranded DNA
[115],  as  well  as  tetramers  of  short  oligonucleotides  that  assemble  to  form 
G-quadruplex-like  structures  [116,  117].  Direct  ESI  MS  measurements  have  also 
been  successful  as  a  means  of  monitoring  DNA  interaction  with  small  ligands,  most 
notably  DNA-targeting  drugs.  Numerous  studies  have  been  published  where  this 
technique  was  employed  to  evaluate  not  only  the  stoichiometry  of  such  non-covalent 
complexes  but  also  their  binding  affinity  (reviewed  in  [118,  119]).  Information  on 
DNA higher order structure can also be provided by using selective chemical labeling
and chemical cross-linking combined with MS analysis of the products, a technique
similar to those discussed in Sects. 7.3.4 and 7.3.5. While a range of chemical
probes  for  DNA  structure  are  available  [120],  mass  spectrometry  has  not  been  a 
prominent  player  in  this  field  until  recently.  This  is  beginning  to  change,  with  the 
realization  of  the  enormous  potential  of  this  technique  as  a  tool  to  provide  rapid  and 
sensitive  characterization  of  the  reaction  products  of  both  cross-linking  [121]  and 
chemical  labeling  [122]. 
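
For the ligand-binding measurements mentioned above, a binding affinity is often estimated from the relative abundances of free and ligand-bound DNA ions. The sketch below shows the arithmetic for the simplest case and rests on several assumptions that are not stated in the text: 1:1 stoichiometry, equal ESI response for free and bound DNA, and no dissociation of the complex in the gas phase. All names and values are hypothetical.

```python
def kd_from_intensities(i_free, i_complex, dna_total, ligand_total):
    """Estimate a dissociation constant (mol/L) for 1:1 binding.

    i_free, i_complex : ion intensities of free DNA and DNA-ligand complex
    dna_total         : total DNA concentration in solution (mol/L)
    ligand_total      : total ligand concentration in solution (mol/L)
    Assumes the intensity ratio equals the solution concentration ratio.
    """
    if i_free <= 0 or i_complex <= 0:
        raise ValueError("both species must be observed")
    ratio = i_complex / i_free                      # [complex] / [DNA]free
    complex_conc = dna_total * ratio / (1.0 + ratio)
    dna_free = dna_total - complex_conc
    ligand_free = ligand_total - complex_conc
    if ligand_free <= 0:
        raise ValueError("bound ligand exceeds total ligand")
    return dna_free * ligand_free / complex_conc


# Hypothetical titration point: equal peak intensities at 10 uM DNA and ligand.
print(kd_from_intensities(1.0, 1.0, dna_total=10e-6, ligand_total=10e-6))
```

In practice a single point is rarely sufficient, and a titration over several ligand concentrations gives a more reliable estimate and a check on the assumed stoichiometry.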



LA.  Kaltashov  and  C.E.  Bobst 


7.3.6.2    Higher  Order  Structure  and  Dynamics  of  RNA 

Unlike  DNA,  RNA  molecules  are  known  to  form  a  rich  variety  of  secondary  and 
tertiary  structures  that  make  them  extremely  versatile,  but  the  biophysical  tools  for 
the  study  of  RNA  structure  are  still  somewhat  less  mature  than  those  for  studies  of 
proteins. Among other things, HDX measurements have been employed to investigate
structure in RNA using NMR [123, 124] and Raman spectroscopy [125].
Although  the  glycosidic  hydrogen  atoms  exchange  rapidly,  it  is  possible  to  measure 
protection  of  the  base  amino  and  imino  protons  that  are  involved  in  structure,  which 
provides  information  about  base-pairing  as  opposed  to  bases  that  are  involved  in 
single-stranded regions and/or bulges. While these exchange reactions are still too
fast to be followed in solution by MS, hydrogen/deuterium exchange can be carried
out in the gas phase, a method that shows promise for determining structural elements
in oligonucleotides [126, 127].

Hydroxyl  radical  modification  has  been  very  successful  as  a  means  of  probing 
oligonucleotide  structure  in  solution,  although  other  chemical  modifications  can  be 
employed  to  investigate  RNA  structure  as  well.  A  variety  of  reagents  are  available 
that  act  as  solvent  accessibility  probes,  since  they  are  unable  to  modify  nucleotides 
involved  in  base-pairing,  stacking,  or  other  tertiary  interactions.  A  similar  approach 
can  be  used  to  probe  RNA  structure  and  RNA-protein  interactions  [128,  129], 
where the extent of chemical labeling is monitored by MS, and subsequent digestion
with ribonuclease and analysis of the resulting fragments by high resolution
MS  allows  the  modification  sites  to  be  localized.  In  addition  to  solvent  accessibility 
information,  chemical  labeling  can  also  provide  a  measure  of  structural  flexibility  of 
RNA  molecules  [130].  Recently,  a  technique  dubbed  MS3D  [92]  was  introduced  to 
probe  higher  order  structure  of  RNA,  the  workflow  for  which  is  shown  in  Fig.  7.17 
[131].  Essentially,  the  structure  of  the  polynucleotide  under  native  conditions  is 
probed  by  a  series  of  chemical  footprinting  reagents.  These  solvent  accessibility 
probes  have  varying  specificity  for  different  bases,  and  their  reactivity  is  limited  by 
the presence of base-pairing, stacking, or other tertiary interactions. Following
labeling, the sites of modification are determined by either bottom-up
(digestion with ribonucleases) or top-down (gas phase fragmentation) methods.
Additional MS/MS techniques can be used to pinpoint the labeled site to the individual
nucleotide.


Fig. 7.17 General workflow for 3D-structure determination of nucleic acids based on structural
probing and MS analysis (MS3D). The substrate is probed under ideal conditions preserving its
native fold. Characterization of the ensuing covalent adducts can be performed under denaturing
conditions, following either bottom-up or top-down approaches. The positions of probed
nucleotides provide spatial constraints that are summarized on 2D maps, from which a complete,
all-atom 3D structure can be readily generated through established molecular modeling protocols.
Reproduced with permission from [134]


7.4    Current  Challenges  and  Future  Directions

Mass spectrometry has truly become a routine analytical tool in diverse fields of
molecular biophysics and structural biology, although many areas remain where it
still faces significant challenges. For example, several classes of proteins are notoriously
difficult to analyze using MS-based approaches, and chief among them are
membrane proteins. The strongly hydrophobic or amphipathic character of membrane
proteins results in their general insolubility, which makes any experimental
study of these proteins an extremely difficult undertaking. Mass spectrometry is not
an  exception,  since  even  sequencing  of  membrane  proteins  is  often  problematic  due 
to  their  extreme  instability  in  solutions  that  are  commonly  used  in  MS  work.  Another 
obstacle to MS analysis is presented by protein aggregation, a process that is now in
the cross-hairs of biophysical research due to its obvious importance in the etiology
of the so-called conformational diseases (such as Alzheimer's and Parkinson's), as
well as its importance in the burgeoning biotechnology and biopharmaceutical sectors.
Finally, mass spectrometry increasingly finds itself in the midst of the ongoing
paradigm shift affecting the entire field of biophysics and structural biology, namely
breaking away from the reductionist description of various biophysical and biochemical
phenomena, and embracing the enormous complexity of living systems.
While  MS  in  general  played  a  very  visible  role  in  catalyzing  this  shift  (particularly 
in  the  fields  of  proteomics  and  interactomics),  many  more  traditional  MS-based 
approaches  to  study  architecture  and  dynamics  of  biological  molecules  were  slow  to 
respond. Clearly, biological MS is and will remain a very dynamic area of
research, one that will continue to evolve and make important contributions
to the life sciences in general, and to the fields of biophysics and structural
biology in particular.


Chapter 8

Single-Molecule  Methods 


Paul  J.  Bujalowski,  Michael  Sherman, 
and  Andres  F.  Oberhauser 


Abstract Single-molecule methods have emerged as powerful tools in life science
research. These techniques allow the detection and manipulation of individual biological
molecules and the investigation, with unprecedented resolution, of their conformations
and dynamics at the nanoscale level. They overcome the restrictions
of traditional bulk biochemical studies by focusing on individual molecules.
Here we describe some of the most common single-molecule methods, including
atomic force microscopy, optical tweezers, and fluorescence microscopy. We also
describe the use of cryo-electron microscopy methods to study large molecules and
macromolecular assemblies. We outline the principles of operation for each technique
and discuss prominent applications.


8.1  Introduction


Single-molecule measurement techniques provide fundamental information on the
structure and function of biomolecules and are becoming an indispensable tool to
understand how biomolecules work. During the last two decades this field has grown at
an almost exponential rate in terms of biological and biophysical applications. Single-molecule
techniques have opened a new field of science that is at the crossroads of
several disciplines, namely biology, physics, chemistry, materials science, and computer
science. Since the development of single-ion channel recording techniques in
the 1970s [109], the family of single-molecule methods has expanded significantly
to include, among others, optical and magnetic tweezers, atomic force microscopy
(AFM), and single-molecule fluorescence. There are two main types of single-molecule
methods: (1) those that do not use an external force, such as single-molecule fluorescence
microscopy or electron microscopy, and (2) those imposing an external force
on the system through an electric field (e.g., patch-clamp) or a mechanical manipulation
(e.g., through tension or torsion). The latter subtype, the so-called single-molecule
manipulation techniques, offers a unique opportunity to study the behavior of
molecules under an external mechanical force, applied either directly using flexible
beams (e.g., AFM, microneedles) or through external-field manipulators (e.g., optical
and magnetic tweezers). Single-molecule methods span five orders of magnitude in
terms of forces, distances, and dynamical ranges. A summary of the features of various
methods is provided in Table 8.1 and discussed in greater detail in the different
sections of this chapter. Single-molecule methods overcome the restrictions of traditional
bulk biochemical studies by focusing not on a population of molecules but on
the molecule itself. These methods are often the approach of choice to clarify and
better understand the functions of molecular motors, transcription, replication, translation,
protein folding, or the structure of membrane proteins. In this chapter we focus
on the most commonly used single-molecule methods, namely AFM, optical tweezers,
fluorescence microscopy, and single-particle imaging using electron microscopy.
For each method we describe the operating principles and practical implementation
and discuss a few noteworthy examples.


8.2    Range  of  Forces  at  the  Single-Molecule  Level 


Biomolecules  are  subject  to  thermal  forces,  which  are  random  in  nature.  When 
these  forces  act  on  small  objects  like  protein  nanomachines  in  solution,  they  result 
in  what  is  called  Brownian  motion.  It  is  through  thermal  energy  that  proteins  reach 
the  high-energy  transition  states  that  are  essential  in  biochemical  reactions.  The 
energies  involved  in  protein  conformational  changes  are  slightly  above  thermal 
energy levels (or thermal noise), typically ranging from 1 kBT (thermal energy;
kBT = 4.1 pN nm = 0.6 kcal/mol at room temperature, where kB is the Boltzmann
constant and T is the absolute temperature) to 25 kBT (the energy released by ATP
hydrolysis) such that the structures are stable enough to prevail at physiological
temperatures. Given that changes in protein conformation are measured in the
Ångström to nanometer range (Å to nm; 1 Å = 10⁻¹⁰ m, 1 nm = 10⁻⁹ m), the relevant
biological forces are expected to be in the piconewton range (1 pN = 10⁻¹² N).
Because  proteins  are  subject  to  thermal  forces,  the  number  of  possible  conforma- 
tions is  at  its  maximum  when  a  protein  forms  a  random  coil  or  is  denatured. 
Conformational  entropy  becomes  progressively  reduced  with  the  formation  of  sec- 
ondary and  tertiary  structures.  Stretching  random-coiled  proteins  in  the  low-force 
regime  to  overcome  "entropic  forces"  requires  the  application  of  forces  in  the  order 
of  a  few  pN,  which  has  been  achieved  experimentally  using  single-molecule- 
manipulation  techniques.  Several  molecular  motors  such  as  myosin,  kinesin,  and 
RNA  or  DNA  polymerases  also  generate  forces  in  this  range  (see  examples  pre- 
sented below).  The  next  group  consists  of  "enthalpic  forces,"  which  includes  the 
forces  needed  to  unfold  the  folded  domains  of  proteins  (i.e.,  intramolecular  interac- 
tions) as  well  as  those  required  to  overcome  specific  intermolecular  interactions 
such  as  ligand/receptor  or  antigen/antibody.  These  forces  are  typically  in  the 
50-300  pN  range  (at  pulling  speeds  of  1  um/s).  It  must  be  noted  that  protein 
mechanical  unfolding  is  typically  a  nonequilibrium  dynamic  process  and  therefore 
these  forces  depend  on  the  pulling  speed.  The  typical  pulling  speeds  in  vivo  may  in 
some  cases  be  much  lower  and  therefore  the  corresponding  forces  may  also  be 
lower.  The  forces  needed  to  break  covalent  bonds  apart  are  almost  two  orders  of 
magnitude  larger,  in  the  range  of  a  few  nanonewtons  (1  nN=  10~9  N). 
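
As a rough numerical check of the scales quoted above, the following Python sketch computes kBT at an assumed room temperature of 298 K and the force scale obtained by dividing an energy by a distance. The function names are illustrative only, and treating force as energy divided by distance is an order-of-magnitude estimate, not a model of any particular protein.

```python
# Physical constants (exact SI values)
K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol
J_PER_KCAL = 4184.0     # joules per kilocalorie
J_PER_PN_NM = 1e-21     # 1 pN nm = 1e-12 N * 1e-9 m


def thermal_energy_pn_nm(temperature_k=298.0):
    """Thermal energy kBT expressed in pN nm (298 K is an assumed room temperature)."""
    return K_B * temperature_k / J_PER_PN_NM


def thermal_energy_kcal_per_mol(temperature_k=298.0):
    """Thermal energy per mole (N_A * kBT) expressed in kcal/mol."""
    return K_B * temperature_k * N_A / J_PER_KCAL


def characteristic_force_pn(energy_kbt, distance_nm, temperature_k=298.0):
    """Order-of-magnitude force scale: an energy (in units of kBT) spent over a distance."""
    return energy_kbt * thermal_energy_pn_nm(temperature_k) / distance_nm


if __name__ == "__main__":
    print(f"kBT = {thermal_energy_pn_nm():.2f} pN nm")
    print(f"kBT = {thermal_energy_kcal_per_mol():.2f} kcal/mol")
    # Energies of 1 and 25 kBT acting over an assumed distance of 1 nm
    for energy in (1, 25):
        print(f"{energy} kBT over 1 nm ~ {characteristic_force_pn(energy, 1.0):.0f} pN")
```

With energies of 1 to 25 kBT and displacements of about a nanometer, the estimate lands in the piconewton to roughly hundred-piconewton range, consistent with the forces discussed in this section.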


8.3    Atomic  Force  Microscopy  Methods 

The AFM was first described in 1986 and originally developed as a high-resolution
imaging tool [17] before it began to be used to probe and manipulate molecules.
During the last two and a half decades AFM has evolved into a very powerful and
versatile tool in biology that can be used, for example, to manipulate and detect
single proteins, DNA, or polysaccharides [72, 83, 92, 93, 112, 122, 124, 125], to image
single molecules in physiological conditions with nanometer resolution [40, 75, 106],
to measure the interaction forces between proteins [24, 45, 103], to study exocytotic
fusion [20, 102], to map cell surface receptors [35, 62, 63, 105], and to image
molecular motors in action at high speed [75]. The key advantages of the AFM as a
single-molecule technique are the straightforward sample preparation, the ability to
conduct imaging and manipulation experiments on biomolecules under physiologically
relevant conditions, and the direct analysis of the dynamics of single molecules or
complexes.


8.3.1    Basic  Principles

Fig. 8.1 Schematic diagram of an AFM. (a) The AFM is a remarkably simple instrument that can
measure forces down to a few piconewtons and distances of only a few ångströms. The AFM consists
of two main parts: the scanner (XYZ stage) and an optical head. The heart of the system is a small
cantilever that functions as a microscopic spring. When the cantilever is brought into contact with
a sample it bends. This bending can be easily detected by shining a laser on the back of the
cantilever; the light that bounces off is captured by a photodetector. The optical amplification is
such that a tiny deformation of a few nanometers causes a large change in the photovoltage of the
detector, which is then converted into a force signal. (b) A scanning electron microscope image of
commercially available AFM cantilevers showing both triangular and beam-shaped cantilevers. A
human hair with a diameter of ~100 μm is included as a size reference. The inset shows a higher
magnification image of the end of a cantilever showing the tip, which has a radius of curvature
of ~10 nm (obtained with permission from Allison et al. [3])

The AFM is a remarkably simple instrument that can measure forces down to a few
piconewtons and distances of only a few ångströms. The AFM consists of two
main parts: the scanner (XYZ stage) and an optical head (Fig. 8.1a). The core of
the system is a small cantilever, a thin and flexible piece of silicon (about 200 μm
in length and 10 μm in thickness) that works as a microscopic force sensor
(Fig. 8.1b). At the very tip of the cantilever there is a small stylus, which looks like
a very sharp pyramid (in fact it may be atomically sharp; Fig. 8.1b). When
a cantilever is brought into contact with a sample by means of a three-dimensional
nano-positioner, it bends. To track this bending, a laser beam is directed
onto the back of the cantilever; the laser light that bounces off is captured by
a position-sensitive photodetector that tracks the position of the laser spot
(Fig. 8.1a). The optical amplification is such that a tiny deformation of the
cantilever, of only a few nanometers, causes a large change in the photovoltage of the
detector. To measure force, one simply uses the relationship between the force that
a spring develops and the distance by which it is stretched. The proportionality
factor is the stiffness of the cantilever (its spring constant, kc), and the force is
calculated from Hooke's law, F = kc Δx, where Δx represents the cantilever
deflection. A typical spring constant is about 100 pN/nm, so a bending of only 10 Å
(the size of three water molecules) causes a change in force of 100 pN (the noise
level of the system is about 1 Å). For normal topographic imaging, the probe tip is
brought into continuous or intermittent contact with the sample and raster-scanned
over the surface. Several AFM imaging modes can be used, which vary mainly in the
way the tip of the cantilever is moved over the sample. The most popular is the
contact mode, in which topographical information is obtained in one of two ways:
(1) by measuring the cantilever deflection while the sample is scanned at constant
height or (2) by measuring cantilever displacement while maintaining the cantilever
deflection constant using a feedback loop. In the intermittent mode the cantilever is
oscillated at its resonant frequency and the amplitude and phase are recorded while
scanning the sample.
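
The force calculation described above can be sketched in a few lines of Python. The spring constant of 100 pN/nm is the typical value quoted in the text; the function names and the calibration factor `sensitivity_nm` (deflection per unit of normalized photodiode signal) are hypothetical and would have to be determined for a real instrument.

```python
ANGSTROM_IN_NM = 0.1  # 1 Å = 0.1 nm


def deflection_from_photodiode(a, b, sensitivity_nm):
    """Estimate cantilever deflection (nm) from two photodiode voltages.

    Uses the normalized difference signal (a - b) / (a + b), scaled by a
    hypothetical calibration factor `sensitivity_nm` (nm per unit signal).
    """
    return sensitivity_nm * (a - b) / (a + b)


def cantilever_force_pn(deflection_nm, spring_constant_pn_per_nm=100.0):
    """Hooke's law, F = kc * Δx, with kc in pN/nm and Δx in nm."""
    return spring_constant_pn_per_nm * deflection_nm


if __name__ == "__main__":
    # A 10 Å bend of a 100 pN/nm cantilever
    print(cantilever_force_pn(10 * ANGSTROM_IN_NM), "pN for a 10 Å deflection")
    # Force equivalent of a 1 Å deflection noise level
    print(cantilever_force_pn(1 * ANGSTROM_IN_NM), "pN for a 1 Å deflection")
```

Under these assumptions, the 1 Å deflection noise quoted in the text corresponds to a force uncertainty of roughly 10 pN for a cantilever of this stiffness.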

8.3.2    Imaging  Single  Biomolecules  Using  AFM 

One of the most stunning applications of the AFM is the imaging of single
membrane proteins at subnanometer resolution under physiologically relevant
conditions [39, 46]. The three-dimensional structure of gap junction channels obtained
using cryo-electron microscopy at a resolution of 7.5 Å and X-ray crystallography
at 3.5 Å resolution provided direct evidence for alpha-helical folding of four
transmembrane domains within each connexin subunit [90, 145]. AFM provided
complementary information on the structural features of the extracellular surface of
single gap junctions obtained under physiological buffer conditions [65]. Later work
demonstrated that it is possible to track conformational changes at subnanometer
resolution using AFM [104]. Figure 8.2a shows an example of a topographic
image of the extracellular surface of a split native connexin 26 gap junction plaque.
The six individual subunits of the connexon are distinctly visible. The subunits
protrude by about 1.5 nm above the lipid bilayer and are arranged into a donut-shaped
structure surrounding a central pore. Connexin 26 gap junctions participate in
cell-cell communication and respond to changes in the calcium concentration.
Remarkably, it was found that upon injection of calcium into the buffer solution, the
extracellular channel entrance reduced its diameter from 1.5 to 0.6 nm, a
conformational change that was fully reversible.

AFM allows direct nano-imaging of DNA-protein complexes at the single-molecule
level [59, 88, 89]. In these experiments the protein is readily identified
as a "blob" on the DNA. The change in length of the DNA upon binding the
protein gives an indication of the extent to which the DNA is looped within or
wrapped around the protein [56, 59]. For example, AFM was recently applied to
the important problem of DNA mismatch repair [70]. Several proteins participate
in detecting mismatches and directing repair. In bacteria, MutS initiates the
mismatch repair process. Figure 8.2b shows AFM images of DNA:MutS complexes
containing a mismatch in the middle of the DNA molecule. The volume
measurements suggested that each "blob" in these images corresponds to a MutS
tetramer. This demonstrates the power of AFM methods in resolving individual
nucleoprotein complexes in liquid.

Fig. 8.2 AFM: a nano-toolbox for single-molecule detection and manipulation. (a) Tracking
conformational changes in surface structures of isolated connexin 26 gap junctions. AFM topograph
of individual connexons. Inset: Average of the raw data exhibiting a lateral resolution of ~1.2 nm.
The six individual subunits of the connexon are distinctly visible. The subunits protrude by about
1.5 nm above the lipid bilayer and are arranged into a donut-shaped structure surrounding a central
pore. Reproduced with permission from Muller et al. [104]. (b) Imaging DNA-protein complexes at
the single-molecule level. AFM images of DNA:MutS complexes containing a mismatch in the middle
of the DNA molecule. In these experiments the MutS protein is readily identified as a "blob" on the
DNA. Each image is 250 × 250 nm. Reproduced with permission from Jiang and Marszalek [70]. (c)
The AFM can also be used to analyze the unfolding and refolding pathways of single proteins. This
cartoon diagram depicts a multidomain protein (e.g., titin) being stretched by the AFM tip. As the
protein is stretched (Δx) the force rises until one of the domains unfolds, resulting in a sudden
decrease in the force. (d) Measuring ligand (biotin) and receptor (avidin) interactions using AFM.
Schematic representation of the interaction between an AFM tip, functionalized with avidin
molecules, and a biotin-derivatized agarose bead (not drawn to scale). The biotin molecules are
covalently coupled to the bead. During the withdrawal of the AFM tip the tension across the
avidin-biotin complex increases gradually. Reproduced with permission from Florin et al. [44]


8.3.3    Single-Molecule  Force  Spectroscopy  of  Biomolecules 

Single-molecule force spectroscopy (SMFS) was designed to record force-extension
curves obtained by pulling in a single direction (z axis; Fig. 8.2c) [28, 31, 122, 124].
Two basic SMFS modes are currently used depending on the variable being
controlled: the more common length-clamp, which yields a force-extension curve
(Fig. 8.3b, top panel), and the force-clamp, which yields an extension-time curve
(Fig. 8.3b, bottom panel). SMFS is a very sensitive technique that can measure
forces of tens of piconewtons and changes in length with nanometer resolution.
However, a common problem is that force peaks can originate from a variety of
sources other than the interaction of interest (detachment of other molecules from
either of the two anchoring points, protein-protein interactions, disentanglement of
molecules, etc.) or from multiple molecules in parallel. This drawback was overcome
by using long multidomain proteins (such as titin, tenascin, or spectrin) [82, 114, 124]
or recombinant homo-oligomeric proteins [27], whose periodicity provides an
unequivocal signature of a single molecule. The protein molecules are first
immobilized between the substrate (a glass coverslip) and the tip of the cantilever.
Typically, in these experiments proteins are attached by physisorption (i.e.,
nonspecific adsorption), although sometimes specific functionalization methods are
also used (e.g., terminal cysteine residues in the protein that become covalently
linked to the gold-coated surface of the substrate and/or the cantilever tip). Protein
molecules are then stretched by moving the AFM piezoelectric positioner away from
the cantilever tip, which applies a stretching force that unfolds the protein. The
resulting force versus extension curve can be simply analyzed using the worm-like
chain model for polymer elasticity, which describes how a polypeptide chain behaves
under a stretching force [23]. In the case of modular proteins such as titin the
resulting force-extension curve displays the characteristic sawtooth pattern that
results from the unfolding of individual immunoglobulin domains [124] (Fig. 8.3b).
The typical range of the forces required to unfold single proteins is between 50 and
500 pN (at pulling speeds of about 1 μm/s) [112]. By reversing the motion of the
AFM positioner, the protein can also be refolded in the presence or absence of
mechanical force. In the force-clamp mode (Fig. 8.3b, bottom panel), a feedback
mechanism quickly corrects the distance between the coverslip and the AFM tip in
order to control the applied force. After the application of a stretching force, a
multidomain protein such as titin unfolds in a staircase pattern in which each step
corresponds to the all-or-none unfolding of an individual domain. The main advantage
of this mode is the precise control of the end-to-end distance of the protein with
sub-nanometer resolution on the millisecond time scale. Force-clamp SMFS techniques
are currently being used to tackle fundamental problems in biology such as protein
folding [2, 42, 115, 160] and chemical mechanisms in enzyme catalysis [15, 120,
141]. SMFS methods are constantly providing exciting and promising new ways
to study the molecular mechanisms of protein folding. For example, SMFS allows the
direct measurement of the main energy barriers in the unfolding and folding
pathways and the location of these barriers with single amino acid resolution [26, 151].
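
To make the worm-like chain analysis mentioned above more concrete, the following Python sketch evaluates a widely used interpolation formula for the worm-like chain, F = (kBT/p)[1/(4(1 - x/L)²) - 1/4 + x/L], where p is the persistence length and L the contour length. This is offered as one common approximation rather than as the exact procedure used in the cited work; the persistence length of 0.4 nm and the contour length of 30 nm are illustrative assumptions, not values taken from the text.

```python
KBT_PN_NM = 4.1  # thermal energy at room temperature in pN nm, as quoted in Sect. 8.2


def wlc_force_pn(extension_nm, contour_length_nm, persistence_length_nm=0.4):
    """Force (pN) needed to hold a worm-like chain at a given end-to-end extension.

    Interpolation formula: F = (kBT / p) * [1 / (4 (1 - x/L)^2) - 1/4 + x/L].
    The default persistence length (0.4 nm) is an assumed, illustrative value.
    """
    if not 0.0 <= extension_nm < contour_length_nm:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    x = extension_nm / contour_length_nm  # fractional extension
    return (KBT_PN_NM / persistence_length_nm) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)


if __name__ == "__main__":
    contour_length = 30.0  # nm, hypothetical unfolded segment
    for fraction in (0.25, 0.5, 0.75, 0.9):
        extension = fraction * contour_length
        force = wlc_force_pn(extension, contour_length)
        print(f"x/L = {fraction:.2f}: F = {force:.1f} pN")
```

The force rises steeply as the extension approaches the contour length. In a sawtooth trace, each unfolding event adds to the contour length, so the curve drops back to low force before rising again along a new worm-like chain branch.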

Fig. 8.3 (a) Schematic diagram of the AFM apparatus and associated control electronics. When
pressed against a layer of protein attached to a substrate, the silicon nitride tip of the AFM
cantilever may adsorb a single protein molecule. Extension of the molecule by retraction of the
piezoelectric positioner results in deflection of the cantilever. A single molecule is stretched by
using either a length-clamp mode or a force-clamp mode. In the standard length-clamp mode the
desired position (L) is set and the resulting force (F) is measured; the force is calculated from the
laser deflection, (a - b)/(a + b), where a and b correspond to the photovoltages in the
position-sensitive detector. In the force-clamp mode, the measured force is compared with a set
value, generating an error signal that is fed to a proportional, integral, and differential amplifier
(PID) whose output is connected directly to the piezoelectric positioner. Reproduced with
permission from Oberhauser et al. [113]. (b) Example of stretching a multidomain protein using the
length-clamp (top panel) and force-clamp (bottom panel) modes of the AFM. The AFM tip picks up a
single protein (1) and starts pulling on it (2). When sufficient force is applied (around 200 pN) the
domains begin to unfold (3). Further extension of the protein unravels it in a typically all-or-none
fashion (3). In the length-clamp mode this is seen as a "sawtooth" pattern where each peak
corresponds to the unfolding of individual domains. In the force-clamp mode the unfolding is seen
as a "staircase" pattern where each step corresponds to individual unfolding events

The AFM has also been successfully used to analyze the magnitude of the
interaction forces between single ligand-receptor pairs (Fig. 8.2d). In this method
the tip of the cantilever, functionalized with one molecule, is brought into contact
with a surface that is covered with the other molecule, allowing them to interact for
a short time (milliseconds to seconds). Upon retraction of the cantilever tip from the
surface, molecular bonds are broken and the adhesion forces between the two
molecules are quantified. For example, the interaction forces between avidin and its
ligand, biotin, have been comprehensively studied using SMFS [44, 69, 103]. Avidin
is a tetrameric protein that binds four molecules of biotin with particularly strong
affinity (KD ~ 10⁻¹⁵ M). In these experiments both the cantilever tip and a glass
surface are functionalized with biotin; then avidin is added in order to block most of
the biotin molecules. Under these conditions it is possible to pull a single biotin out
of its avidin binding site, revealing the force needed to rupture ligand-receptor
bonds. This was found to be around 150 pN, which is equivalent to an unbinding free
energy of about 23 kcal/mol, making it one of the strongest known non-covalent
bonds [68].


8.4    Optical  Tweezers  Methods 

Optical tweezers, also known as optical traps, were first described in the late 1980s by
Ashkin and colleagues. They are based on the fact that small dielectric particles [6],
including viruses and bacteria [5, 7], can be trapped by using a focused laser. Since
then the field of single-molecule manipulation with optical tweezers has grown at an
impressive pace and has proven to be an important single-molecule method in a wide
range of research fields [16, 41, 52, 76, 97, 101, 137]. For example, optical tweezers
have been used to investigate molecular motors such as myosin [4, 43, 126, 137] and
kinesin [129, 139, 140], processive enzymes such as DNA [152] and RNA
polymerases [33, 81, 149], DNA translocases [121], endonucleases [50], and
helicases [30, 36], the bacteriophage packaging motor [135, 155], the unfolding and
refolding of single proteins [29, 90, 138, 143] and RNA hairpins [85], the mechanism
of action of molecular chaperones [13], and protein translocation by ATP-dependent
proteases [8, 91].


8.4.1    Basic  Principles

Optical tweezers take advantage of the gradient force produced by a focused laser
beam. The force exerted by an optical field on a small dielectric object (such as a
plastic bead) falls within the range of 0.1-100 pN and is used to "trap" and
manipulate it with subnanometer precision, allowing the simultaneous determination
of force and displacement [41, 101, 110]. Optical tweezers use a microscope objective
to create a focused Gaussian beam that exerts a force on the bead in the direction of
the field gradient, which draws it towards the center of the laser beam (Fig. 8.4a).
When trapped, the bead behaves as if it were attached to a small Hookean spring,
with the force given by F = k Δx, where k is the spring constant of the trap and Δx is
the displacement of the bead from the focus of the trap (Fig. 8.4b, right inset). A
restoring or trapping force arises whenever the bead is displaced from its
equilibrium position. Photons carry momentum that is proportional to their energy
and points along their direction of propagation. The photons interacting with the
bead undergo refraction and as a result change their momentum (Fig. 8.4b, left
inset). Because of the conservation of momentum, the bead must also experience a
rate of change of momentum that is equal in magnitude but opposite in direction,
which tends to restore the bead back to the center of the beam. This gives rise to a
net force acting on the bead, which can be measured by means of a
position-sensitive photodetector. The spring constants of optical traps are typically
1,000 times smaller than those of AFM cantilevers, in the range of 0.01-1 pN/nm,
meaning that a lower range of forces is accessible (0.1-100 pN; Table 8.1). In order
to minimize optically induced damage to the sample from the intense laser beam,
wavelengths in the near-infrared (800-1,100 nm) are used because biological
samples are relatively transparent to infrared light.

Fig. 8.4 Basic principles of operation of optical tweezers. (a) Basic elements in an optical
tweezers instrument. A laser beam (typically infrared, IR) is focused on a small dielectric bead
(inside a fluid chamber) by means of a microscope objective, and the exiting light is collected by a
position-sensitive photodetector to measure both its intensity and its deflection, from which the
force (in the range of 0.1-100 pN) is obtained. (b) Optical tweezers can be built by focusing a laser
beam through a lens to form a "trap." When trapped, the bead behaves as if it were attached to a
small Hookean spring. A trapping force arises whenever the bead is displaced from its equilibrium
position. Photons carry momentum that is proportional to their energy and points along their
direction of propagation. The photons interacting with the bead undergo refraction and as a result
change their momentum; hence the bead experiences a rate of change of momentum that is equal
in magnitude but opposite in direction, which tends to restore the bead back to the center of the
beam. Reproduced with permission from Bustamante et al. [22]. (c) Typical geometries used in
optical tweezers experiments: the tethered particle assay (top), the pipette-tethered single-trap
assay (center) and the dual-trap assay (bottom)

In single-molecule optical tweezers experiments, the molecule of interest is
tethered to the optically trapped bead with different conjugation chemistries,
generally based on avidin-biotin or antibody-epitope interactions. Different optical
tweezers geometries have been designed according to the type of molecule and the
requirements for resolution and stability (Fig. 8.4c) [52, 101]. In the tethered
particle assay (Fig. 8.4c, top) the protein (e.g., a cytoskeletal motor such as kinesin
or myosin) is attached to the optically trapped bead and the filament (microtubule or
actin) is attached by physisorption to the surface of a coverslip. Motions of the
motor are monitored through the changes in the relative position of the trapped
bead. This is the simplest geometry in optical tweezers instruments but it is prone to
mechanical noise originating from thermal drift of the coverslip and laser
fluctuations. In the pipette-tethered single-trap assay (Fig. 8.4c, center) [52, 134],
one end of the molecule (e.g., a long segment of DNA or RNA) is attached to the
trapped bead and the other end is bound to a second bead held by suction on a glass
micropipette. The pipette can be moved away from the trapped bead to apply tension
to the molecule, and changes in length are monitored by tracking the motion of the
optically trapped bead. In the dual-trap assay (Fig. 8.4c, bottom) a second bead is
held using a second optical trap. In this case the molecule under study (e.g., RNA
polymerase) is attached to one bead and a DNA handle is tethered to a second bead
via avidin-biotin linkages. The relative motions of both beads are independently
monitored. The beads are held in separate optical traps free of the coverslip surface,
greatly improving the stability and reducing the noise of the system. The dual-trap
geometry offers the simplest way to implement a force-feedback or force-clamp mode
of operation in which the position of one trap relative to the second trap is adjusted
via a feedback loop to maintain a constant force on the beads at all times [147].
Recent technical advances in dual-trap optical tweezers have pushed the spatial
resolution down to the angstrom scale [1, 21, 100]. With this resolution,
single-base-pair motions of RNA polymerase were detected [1].
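
The force-clamp idea described above can be illustrated with a deliberately simplified Python sketch. It treats the trap as a Hookean spring (F = k Δx) and moves the trap position with a proportional feedback step until the force returns to its set point. Everything specific here is an assumption made for illustration: the function name, the stiffness of 0.1 pN/nm, the 5 pN set point, the feedback gain, and the idealization that the bead sits exactly at the end of an inextensible tether with no thermal noise.

```python
def run_force_clamp(tether_lengths_nm, stiffness_pn_per_nm=0.1,
                    set_force_pn=5.0, gain=0.5, iterations_per_sample=20):
    """Simulate a proportional force-clamp on an idealized trapped bead.

    tether_lengths_nm: bead position (tether end) at successive time points.
    Returns (trap_positions_nm, forces_pn) recorded after the feedback settles.
    """
    # Start with the trap placed so that the force equals the set point.
    trap_pos = tether_lengths_nm[0] + set_force_pn / stiffness_pn_per_nm
    trap_positions, forces = [], []
    for length in tether_lengths_nm:
        for _ in range(iterations_per_sample):
            force = stiffness_pn_per_nm * (trap_pos - length)  # Hooke's law
            error = set_force_pn - force
            # Proportional correction: move the trap to cancel part of the error.
            trap_pos += gain * error / stiffness_pn_per_nm
        trap_positions.append(trap_pos)
        forces.append(stiffness_pn_per_nm * (trap_pos - length))
    return trap_positions, forces


if __name__ == "__main__":
    # Hypothetical tether that shortens in two 8 nm steps
    lengths = [1000.0, 1000.0, 992.0, 992.0, 984.0]
    positions, forces = run_force_clamp(lengths)
    for length, pos, force in zip(lengths, positions, forces):
        print(f"tether {length:7.1f} nm  trap {pos:8.2f} nm  force {force:5.2f} pN")
```

In this idealized picture the force stays at the set point while the trap position tracks every change in tether length, which is why the trap position record can serve as the readout in a force-clamp experiment.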


8.4.2    Examples  of  Biological  Applications  of  Optical  Tweezers: 
Motion  of  Single  Molecular  Motors  and  Folding/ 
Unfolding  Reactions  of  Single  Proteins 

Some of the most important applications of optical tweezers date from the 1990s,
when they were used to measure the motions of motor proteins such as kinesin [140]
and myosin [43] with nanometer resolution. Motor proteins are enzymes that are
powered by ATP and drive a wide variety of subcellular movements such as organelle
transport, cell and chromosomal division, and muscle contraction. These pioneering
experiments demonstrated that these motors walk in nanometer-sized steps. In an
assay for kinesin-driven motility, a bead carrying a single motor protein is first
trapped and then brought into contact with a microtubule bound to a glass coverslip,
as shown in Fig. 8.5a. The kinesin motor attaches to the microtubule filament and
"walks" in steps along it (Δx), pulling the bead from the trap and building up a force
that resists further motion of the motor.

Fig. 8.5 Examples of biological applications of optical tweezers. (a) Single kinesin molecules
studied using optical tweezers techniques [148]. Record of the displacement of a single kinesin
motor showing discrete 8 nm steps (black trace) as it walks along a microtubule (inset, not
to scale). The position of the optical trap is under computer control to maintain a fixed
distance behind the bead, thereby imposing a load of a few piconewtons in a direction that
hinders movement (red trace). Reproduced with permission from Visscher et al. [148]. (b)
Folding and unfolding reactions studied by optical tweezers techniques [138]. Traces
representing stretching and relaxation cycles of calmodulin (top panel). The inset presents the
experimental setup, in which calmodulin is linked at its ends to ubiquitins that are attached to
DNA handles, which are connected to functionalized silica beads. (Bottom panel) Traces of the
fluctuations of a single calmodulin molecule at constant trap separation at 5 min time intervals.
The vertical scale represents the force acting on a single molecule. The identified intermediate
states of the protein are colored. The data at full resolution are shown in grey and low-pass
filtered data are shown in black. Reproduced with permission from Stigler et al. [138]
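
As a small worked example of the force build-up described for the kinesin assay, the Python sketch below applies Hooke's law to a bead pulled out of a stationary trap in discrete 8 nm steps (the step size quoted in the caption of Fig. 8.5). The trap stiffness of 0.05 pN/nm is a hypothetical value chosen from within the 0.01-1 pN/nm range given earlier, and the sketch assumes that the bead follows the motor exactly, ignoring any compliance of the motor-bead linkage.

```python
def trap_force_after_steps(n_steps, step_nm=8.0, stiffness_pn_per_nm=0.05):
    """Opposing force (pN) after a motor has pulled the bead n steps out of a fixed trap.

    Uses F = k * Δx with Δx = n_steps * step_nm; the stiffness is an assumed value.
    """
    return stiffness_pn_per_nm * n_steps * step_nm


if __name__ == "__main__":
    for n in range(6):
        print(f"{n} steps ({n * 8} nm): {trap_force_after_steps(n):.1f} pN")
```

With a stationary trap the load on the motor grows with every step, whereas the computer-controlled trap described in the caption of Fig. 8.5a follows the bead so that the load stays approximately constant.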


Optical traps can also exert forces while single proteins undergo structural
changes such as unfolding and folding reactions. Optical tweezers have been used
to pull on the ends of a single folded protein molecule until it straightened out and
then to reduce the tension to allow it to fold again [29, 90, 131, 138, 143]. Recent
experiments on calmodulin showed that optical tweezers allow the dissection of
protein folding pathways [138] (Fig. 8.5b). The calmodulin protein is stretched to a
preset force value at which complex fluctuations of the full-length protein between
its folded and unfolded states can be observed and analyzed. The equilibrium
fluctuation data reveal a number of intermediate states and allow the reconstruction
of possible transition pathways between the intermediates; thus these experiments
can address the role of intermediates and the mechanism of protein folding.


8.5    Single-Molecule  Fluorescence  Methods

Fluorescence methods have been used for the detection of single molecules for
several decades. One of the first papers describing the visualization of a single
protein by fluorescence methods was published in the 1970s by Hirschfeld [64], who
showed the optical microscopic observation of single antibody molecules labeled
with ~100 fluorophores. Since then the single-molecule fluorescence field has
advanced at a remarkable pace. For example, it is now possible to track the
rotational motion of single proteins on living cells with millisecond resolution [144].
One of the main advantages of single-molecule fluorescence techniques over AFM or
optical traps is that they are largely noninvasive and typically require less complex
instrumentation, and hence are fairly accessible to cell biologists, biochemists, and
biophysicists.


8.5.1    Basic Principles

The basis of fluorescence is the relaxation of a fluorophore from an excited electronic
state (high energy) to the ground state (low energy) accompanied by the emission of
radiation [78]. The emission rates are on the order of $10^{8}$ s$^{-1}$, meaning that
typical fluorescence lifetimes are in the nanosecond range. Fluorescence typically occurs
from aromatic molecules; in single-molecule experiments the favorite fluorophores
are those based on cyanine (e.g., Cy3 and Cy5) and rhodamine (e.g., Texas red and
Alexa dyes) [71, 78]. These dyes can be incorporated into the molecule of interest
via chemical coupling to free sulfhydryl or amino groups. The quantum yield,
fluorescence lifetime, and photostability are the most important properties of
fluorophores used in single-molecule experiments [78]. Quantum yield refers to the
number of emitted photons relative to the number of absorbed photons; the lifetime
determines the time available for the fluorophore to interact with its environment. The
emission wavelength, quantum efficiency, and lifetime are highly dependent on the local
physical and chemical microenvironment, and hence the fluorophore relaxation process can
be used to obtain structural and dynamical information at the single-molecule level.

Almost all fluorophores are photobleached upon continuous illumination.
Photobleaching is an irreversible process that results in the loss of fluorescence. In
typical single-molecule fluorescence experiments a single fluorophore emits about
$10^{6}$ photons before undergoing photobleaching [98]. The Alexa dyes and the cyanine
dyes Cy3 and Cy5 are commonly used in single-molecule experiments because several of
them combine a high quantum yield with reasonable photostability, and they appear to
have been developed with this purpose in mind [71].

The most common experimental optical geometries used in single-molecule
fluorescence are confocal microscopy (Fig. 8.6a) [111] and total internal reflection
fluorescence microscopy (TIRF, Fig. 8.6c) [9-11, 71, 73, 118]. In confocal microscopy a
laser beam is focused by a high-numerical-aperture objective lens on the sample [150],
resulting in the excitation of a very small volume (on the order of femtoliters,
$10^{-15}$ L). The emitted fluorescence is collected by the objective and passed through
a pinhole positioned in the conjugate focal plane in order to avoid the collection of
photons generated at locations outside the focal plane. The light is collected
by a very fast detector, such as an avalanche photodiode, making it possible to achieve
a time resolution on the order of picoseconds. In TIRF microscopy an evanescent
wave is generated by directing a laser beam through a high-numerical-aperture
objective lens or a prism [9-11, 118]. The laser beam is focused at the boundary
between two optical media having different refractive indices at such an angle that
all the light is reflected off the glass surface. An evanescent wave nevertheless forms
at the interface and is capable of exciting fluorophores near the surface. The intensity
of the evanescent field decays exponentially with the distance from the surface, allowing
the selective excitation of fluorophores within only ~100 nm of the surface and resulting
in a very low background signal. The emitted light is collected by two-dimensional
detectors such as charge-coupled device (CCD) cameras, allowing the simultaneous imaging
of hundreds of single molecules with millisecond resolution.

Fig. 8.6 Typical experimental setups used in single-molecule fluorescence. Single-molecule
fluorescence methods rely on the detection of photons from a small diffraction-limited
spot (confocal, a) or from a large area (wide-field, c). (a) In confocal microscopy the
fluorescence emitted from a small volume (femtoliters) is detected using a fast detector,
such as an avalanche photodiode, making it possible to achieve a time resolution on the
order of picoseconds. (b) Diagram depicting the Gaussian intensity distribution of a
single fluorophore. By tracking the center of the distribution (with uncertainty Δx) it
is possible to follow the trajectories of single molecules. (c) In total internal
reflection fluorescence (TIRF) microscopy the evanescent wave created (~100 nm in depth)
is used to excite fluorescent molecules near the coverslip. The use of two fluorophores
allows the tracking of conformational changes of a single biomolecule (e.g., labeled with
green and red fluorophores) using FRET. In order to stably and specifically immobilize
biomolecules on a glass coverslip, a biotin-streptavidin linkage is frequently used.
(d) Diagram showing the dependence of the energy transfer efficiency (E) on the
distance R. R0 is the Förster distance
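The exponential decay of the evanescent field can be made concrete with a short calculation. The sketch below uses the standard expression for the 1/e penetration depth of the evanescent intensity, d = λ / (4π √(n1² sin²θ − n2²)); the laser wavelength, refractive indices, and incidence angle are illustrative assumptions, not values taken from this chapter.

```python
import math


def penetration_depth_nm(wavelength_nm, n_glass, n_sample, angle_deg):
    """Return the 1/e decay length (nm) of the evanescent intensity.

    Standard total-internal-reflection expression:
    d = lambda / (4 * pi * sqrt(n_glass^2 * sin^2(theta) - n_sample^2)).
    """
    theta = math.radians(angle_deg)
    critical = math.asin(n_sample / n_glass)
    if theta <= critical:
        raise ValueError("incidence angle must exceed the critical angle")
    root = math.sqrt((n_glass * math.sin(theta)) ** 2 - n_sample ** 2)
    return wavelength_nm / (4.0 * math.pi * root)


def relative_intensity(z_nm, depth_nm):
    """Evanescent intensity at height z above the surface, relative to z = 0."""
    return math.exp(-z_nm / depth_nm)


# Assumed, illustrative values: 532 nm laser, glass (n = 1.52) against
# water (n = 1.33), 70 degree angle of incidence.
depth = penetration_depth_nm(532.0, 1.52, 1.33, 70.0)
print(f"penetration depth: {depth:.0f} nm")
for z in (0, 50, 100, 200):
    print(f"z = {z:3d} nm  relative intensity = {relative_intensity(z, depth):.3f}")
```

Under these assumptions the decay length comes out somewhat below 100 nm, consistent with the ~100 nm excitation depth quoted above; fluorophores a few hundred nanometers from the surface are excited only weakly, which is why the background is low.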

One of the simplest methods to directly visualize fluorescently labeled molecules
is to track their motion using TIRF microscopy. The spatial resolution of conventional
microscopy is limited by the Rayleigh criterion [132]; the diffraction limit is
approximately $\sigma = \lambda / (2\,\mathrm{NA})$, where $\lambda$ is the emitted
wavelength and NA the numerical aperture of the objective lens. For a
high-numerical-aperture objective (NA = 1.65), two molecules
cannot be resolved if they are closer than ~150 nm. Several methods have been
developed to push the limit for single-molecule fluorescence localization to the
nanometer range. For example, centroid-tracking methods have recently been used
to directly analyze the stepping mechanism of myosin and kinesin molecular motors
[126, 153, 154]. In this technique the intensity profile of the light emitted by the
fluorophore is fitted to a Gaussian distribution function (Fig. 8.6b). The center of the
fluorescence distribution (centroid) can be determined with an uncertainty, $\Delta x$, of
$\Delta x = \sigma / \sqrt{N}$, where $N$ is the number of photons counted and $\sigma$
is the width of the diffraction-limited spot [52]. Hence, in this method the resolution
is not limited by the Rayleigh criterion but instead by the number of photons counted
over short periods of time. Tracking the motion of single molecules has reached the
remarkable precision of 1 nm with millisecond temporal resolution [77, 119, 153, 154].
This centroid-tracking method was fittingly named "fluorescence imaging with one
nanometer accuracy" (FIONA) [136, 153, 154].
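The two relations in this paragraph, the diffraction limit λ/(2NA) and the centroid uncertainty σ/√N, can be checked numerically. The following is a minimal sketch: the 500 nm emission wavelength, the 125 nm spot width, and the photon counts are assumed for illustration, and the simulation models a one-dimensional spot with no background or pixelation.

```python
import math
import random
import statistics


def diffraction_limit_nm(wavelength_nm, numerical_aperture):
    """Diffraction limit, lambda / (2 NA)."""
    return wavelength_nm / (2.0 * numerical_aperture)


def localization_uncertainty_nm(sigma_nm, n_photons):
    """Centroid uncertainty, sigma / sqrt(N)."""
    return sigma_nm / math.sqrt(n_photons)


def simulated_centroid_spread_nm(sigma_nm, n_photons, n_trials, seed=0):
    """Scatter of the centroid over repeated simulated exposures.

    Each exposure draws n_photons positions from a Gaussian spot of width
    sigma_nm and takes their mean as the centroid estimate.
    """
    rng = random.Random(seed)
    centroids = []
    for _ in range(n_trials):
        positions = [rng.gauss(0.0, sigma_nm) for _ in range(n_photons)]
        centroids.append(statistics.fmean(positions))
    return statistics.stdev(centroids)


limit = diffraction_limit_nm(500.0, 1.65)            # assumed 500 nm emission
sigma = 125.0                                        # assumed spot width (nm)
predicted = localization_uncertainty_nm(sigma, 1000)
simulated = simulated_centroid_spread_nm(sigma, 1000, 200)
print(f"diffraction limit: {limit:.0f} nm")
print(f"predicted uncertainty for 1,000 photons: {predicted:.2f} nm")
print(f"simulated uncertainty for 1,000 photons: {simulated:.2f} nm")
print(f"photons needed for 1 nm: {(sigma / 1.0) ** 2:.0f}")
```

With these assumed numbers, roughly 10^4 detected photons are needed to reach 1 nm precision, which illustrates why the photon budget before photobleaching, rather than the Rayleigh criterion, sets the practical limit.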

One of the most commonly used single-molecule fluorescence methods is FRET
(fluorescence resonance energy transfer), also called Förster transfer [78]. FRET
between two fluorophores occurs when the excitation energy of the donor is transferred
to the acceptor via an induced dipole-dipole interaction. The energy transfer efficiency
is given by $E = 1/[1 + (R/R_0)^6]$, where $R$ is the distance between the donor and
acceptor fluorophores and $R_0$ is the so-called Förster radius, which is the distance
at which $E = 0.5$ (Fig. 8.6d). The efficiency of the transfer is a very sensitive
function of the distance between the two fluorophores in the 2-10 nm range [34, 57, 58].
Experimentally, $E$ is obtained from $E = [1 + \gamma I_d / I_a]^{-1}$, where $I_d$ and
$I_a$ are the donor and acceptor intensities and $\gamma$ is a correction factor that
depends on the donor and acceptor quantum yields and detection efficiencies [57, 73, 127].
Because $R_0$ itself depends on the relative orientation of the donor and acceptor
transition dipoles, the FRET efficiency depends not only on the interdye distance but
also on the angles between the respective dipoles [78], making it difficult to convert
$E$ into quantitative distances.
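The two expressions for E translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the Förster radius of 5 nm is an assumed typical value, and the conversion back to a distance treats R0 as a known constant, which ignores the orientation dependence noted above.

```python
def fret_efficiency(r_nm, r0_nm):
    """E = 1 / (1 + (R / R0)^6)."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def efficiency_from_intensities(i_donor, i_acceptor, gamma=1.0):
    """E = [1 + gamma * I_d / I_a]^-1 from measured intensities."""
    return 1.0 / (1.0 + gamma * i_donor / i_acceptor)


def distance_from_efficiency(e, r0_nm):
    """Invert E(R) for R, assuming R0 is known and fixed (an idealization)."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / e - 1.0) ** (1.0 / 6.0)


R0 = 5.0  # assumed Foerster radius in nm, for illustration only
for r in (2.0, 4.0, 5.0, 6.0, 10.0):
    print(f"R = {r:4.1f} nm  E = {fret_efficiency(r, R0):.3f}")

# Hypothetical intensities: 300 donor counts, 700 acceptor counts, gamma = 1
e_measured = efficiency_from_intensities(300.0, 700.0)
print(f"E = {e_measured:.2f}  apparent R = {distance_from_efficiency(e_measured, R0):.2f} nm")
```

The table printed by the loop shows the steep sixth-power dependence: the efficiency changes from nearly 1 to nearly 0 across the 2-10 nm range and is most sensitive near R = R0.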

Single-molecule FRET is perhaps one of the most flexible and successful
single-molecule techniques in biology. This is because of its inherent sensitivity and
the steep dependence of the energy transfer on the distance between the dyes
(proportional to $R^{-6}$). FRET can effectively measure distances between 2 and 10 nm,
making it ideal for tracking conformational changes [127]. Single-molecule FRET time
trajectories are most commonly acquired by imaging surface-immobilized molecules
with the aid of TIRF microscopy (Fig. 8.6c), which allows high-throughput data
acquisition [161]. Single-molecule FRET has been successfully used to track the
conformational changes of a wide array of molecules, such as calcium-binding proteins
[142], four-way DNA (Holliday) junctions [96], the reverse transcription initiation
complex [18], the folding of single ribozymes [161], the folding of single proteins
[74, 107], and the conformations of individual SNARE proteins in live cells [146].

A number of "super-resolution" fluorescence microscopy techniques have been
recently developed that achieve lateral resolutions on the nanometer scale, allowing
the localization of single molecules on fixed or living cells [49, 55, 66, 74, 86, 133].
In these techniques the imaging setup and fluorophores are designed in such a way
as to circumvent the Rayleigh criterion for resolution. For instance, photoactivated
localization microscopy (PALM) [14], fluorescence photoactivation localization
microscopy (FPALM) [61], and stochastic optical reconstruction microscopy
(STORM) [128] can achieve extremely high resolution by localizing individual
photoactivatable fluorophores on cells or tissues. These techniques are opening
exciting new opportunities for biologists to interrogate single cells at the molecular
scale through direct observation of protein movement.


8.5.2    Examples  of  Biological  Applications  of  Single-Molecule 
Fluorescence  Methods 

FIONA methods have been widely applied by several groups to track the motion of
cytoskeleton motor proteins [77, 119, 153, 154]. For example, FIONA has been
successfully used to measure step sizes of fluorescently labeled myosin VI molecules
(Fig. 8.7a, left panel). In these experiments a single Cy5 dye was attached to one
head of the myosin molecule and an in vitro motility assay combined with FIONA
was used to track myosin VI walking on actin filaments [116]. Single fluorophores
were localized with nanometer resolution by fitting the fluorescent peak to a
two-dimensional Gaussian function (Fig. 8.7a, center panel). The authors observed the
movement of the Cy5 fluorophore occurring in discrete steps (Fig. 8.7a, right panel).
The average step size of myosin VI was 36 nm, a result consistent with a
hand-over-hand stepping mechanism.

Single-molecule FRET techniques have been applied to visualize DNA binding
and translocation of ATP-powered enzymes. Escherichia coli Rep is a helicase that
can translocate on ssDNA in the 3'-5' direction using ATP hydrolysis. Rep helicase
was labeled with both donor and acceptor dyes (Fig. 8.7b, left panel), where high
FRET and low FRET would represent closed and open conformations of the helicase,
respectively [108]. Single-molecule FRET experiments revealed that Rep gradually
closes as it approaches the duplex junction and abruptly opens up when it snaps back
and starts another translocation process (Fig. 8.7b, right panel). Experiments with
donor-labeled Rep helicase and acceptor-labeled DNA duplex showed repeated
cycles of helicase translocation along the ssDNA followed by snapback. When both ends
of the ssDNA were labeled, FRET traces revealed the formation of short-lived ssDNA loops.

Fig. 8.7 Examples of biological applications of single-molecule fluorescence methods.
(a) Tracking the motion of myosin VI walking along actin filaments. Left: Sequence of
events during hand-over-hand walking of myosin VI. As a result of ATP hydrolysis the
myosin head moves 36 nm. The trailing head (yellow) switches places with the previously
leading head (green). Myosin molecules are labeled with Cy5 attached to one calmodulin.
Center: Single photon localization. The centers of individual Cy5 dyes were calculated by
curve fitting to a two-dimensional (2D) Gaussian function, which allowed measurement of
the displacement of the labeled motor domain. Right: Staircases of three different
fluorescently labeled myosin VI molecules. The experiments were done at an ATP
concentration of 40 µM. Reproduced with permission from Okten et al., Nature [116].
(b) Left: Diagram of the translocation of Rep helicase along ssDNA. Cy3-labeled Rep
(donor) binds to the ssDNA and moves along it from the 3' end towards the acceptor-labeled
duplex DNA. Right: b, c, Fluorescence FRET intensity traces for Rep translocation along
the ssDNA. When donor-labeled Rep binds to the 3' end of the ssDNA, the donor fluorescence
signal rises sharply. It then steadily decreases, which corresponds to Rep translocation
along the ssDNA towards the acceptor-labeled duplex DNA. The decrease in the donor
fluorescence signal is accompanied by an increase in the acceptor fluorescence signal.
The fluorescence traces show cycles of FRET increases and decreases, suggesting repeated
cycles of Rep translocation along the ssDNA. Experiments were done at 22 °C (b) and
37 °C (c). Reproduced with permission from [108]

RNA is involved in both the storage of information and catalysis, which makes it a very
versatile molecule. Analysis of the folding and catalytic activity of the Tetrahymena
ribozyme was performed by observing the FRET changes occurring between fluorescent
dyes located at the 3' and 5' ends of the surface-immobilized ribozyme [161]. The
experiments allowed the observation of rare docked states not previously observed
by ensemble methods and the discovery of a new folding pathway. The method
enabled the determination of rate constants and added new insight to our
understanding of transition states.


8.6    Single-Particle Cryo-Electron Microscopy


Cryo-electron microscopy (cryo-EM) has recently emerged as a powerful tool to study
the ultrastructure of biological macromolecules, including large proteins, protein
complexes, and nucleic acids. This single-particle method entails purifying the
biomolecules, spotting them onto an EM grid, flash freezing the sample
(cryogenically), and then imaging the frozen sample using an electron microscope. The
collected raw data consist of many (hundreds or thousands of) images of the same
macromolecule in different orientations, which are then processed to produce a
three-dimensional model. The big advantage of electron microscopy is the high
resolving power of electron optics, which allows the examination of fine structural
details. Modern electron microscopes fill the gap between light optics, which resolves
object details on the micron scale ($10^{-6}$ m), and X-ray crystallography, which can
solve the structures of crystallized samples with 0.1-0.3 nm precision
(1 nm = $10^{-9}$ m). In fact, cryo-EM is now closing in on the X-ray crystallography
"monopoly" in that range and is yielding structures with comparable resolution of
0.35-0.4 nm [12, 54, 130, 156, 157, 159]. The current lower limit of cryo-EM in terms of
molecular mass is about 200 kDa [117]; however, it is possible to resolve single proteins
within macromolecular assemblies (e.g., ion channels, viruses) to near-atomic
(0.33-0.46 nm) resolution [12].


8.6.1    Basic Principles

An electron microscope is analogous in concept to a traditional light microscope
but uses electrons to "shine" onto the samples and to form images (Fig. 8.8). This
is possible owing to the low mass of electrons and their electric charge, which allow
electromagnetic fields to bend their trajectories and to form images as in light
optics. In addition to being discrete particles, electrons have wave properties as
well. Their wavelength depends on their energy and can be calculated using the de
Broglie equation, $\lambda = h/p$, where $h$ is Planck's constant and $p$ the
relativistic momentum of the moving electron. $\lambda$ is called the de Broglie
wavelength and is usually extremely small. To calculate numerical values of $\lambda$
for electrons used in electron microscopy, let us find the dependence of $\lambda$ on
the high tension (accelerating voltage) in an electron microscope. For electrons
accelerated in an electric field through a potential difference $U$, the velocity (in
the non-relativistic approximation) is given by $v = \sqrt{2eU/m}$, where $m$ is the
mass of the electron and $e$ is its charge. Therefore,
$\lambda = h/(mv) = h/\sqrt{2meU}$.

Fig. 8.8 Schematic diagram of an EM instrument. (a) An electron beam generated by an
electron gun (electron source) is scattered by the sample and focused by the objective
lens of the microscope, creating an image. (b) The cryo-EM method entails the
purification of the biomolecules and the application of a suspension of the biomolecules
in a solvent to an EM grid with a holey carbon film (a thin carbon film with small holes)
to form very thin layers of the suspension. The sample is then flash frozen in a cryogen
and imaged using an electron microscope

Electrons are typically accelerated in electron microscopes to energies of 60-300 keV,
and their wavelengths range from ~4.8 pm (1 pm = $10^{-12}$ m) to ~2 pm. Electron
microscopes have an electron source, a condenser system, an objective lens, and a number
of projector lenses that magnify the image formed by the objective lens. Instead of
the glass used to make lenses in light optics, electron microscopes use electromagnetic
lenses, which are simply copper windings with magnetic field concentrators made of soft
iron, a ferromagnetic material. The field inside a lens is quite strong, reaching
a strength of several tesla (T); a typical refrigerator magnet produces a field of only
~5 mT. These strong fields are used to focus electrons both to illuminate very small
regions in the specimens and to form their images.
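The wavelength range quoted above can be reproduced from the de Broglie relation. The sketch below is illustrative: it uses the relativistic momentum, p² = 2meU(1 + eU/(2mc²)), alongside the simpler non-relativistic form h/√(2meU), with physical constants entered from standard tables; the function name is hypothetical.

```python
import math

H = 6.62607015e-34           # Planck constant, J s
M_E = 9.1093837015e-31       # electron rest mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
C_LIGHT = 299792458.0        # speed of light, m/s


def electron_wavelength_pm(voltage_v, relativistic=True):
    """De Broglie wavelength (pm) of an electron accelerated through voltage_v."""
    kinetic = E_CHARGE * voltage_v                # kinetic energy eU in joules
    p_squared = 2.0 * M_E * kinetic               # non-relativistic p^2 = 2 m e U
    if relativistic:
        # Relativistic correction factor (1 + eU / (2 m c^2))
        p_squared *= 1.0 + kinetic / (2.0 * M_E * C_LIGHT ** 2)
    return H / math.sqrt(p_squared) * 1e12


for kv in (60, 100, 200, 300):
    rel = electron_wavelength_pm(kv * 1e3)
    nonrel = electron_wavelength_pm(kv * 1e3, relativistic=False)
    print(f"{kv:3d} kV  relativistic: {rel:.2f} pm  non-relativistic: {nonrel:.2f} pm")
```

The relativistic values fall close to the ~4.8 pm and ~2 pm figures in the text at 60 and 300 kV; the non-relativistic formula overestimates the wavelength by a few percent at 60 kV and by more than 10 % at 300 kV.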

The resolving power of electron microscopes is much higher than that of light
microscopes owing to the much shorter wavelength of the electrons used to illuminate the
samples. While the wavelengths of visible light used in light optics range from ~400
to 700 nm, the wavelength of electrons used in electron microscopes is typically
shorter than 0.003 nm. According to the Rayleigh criterion of resolution for a
diffraction-limited optical system (a system with ideal optics),
$\delta = 1.22\,\lambda f / D$, where $\delta$ is the resolution, $\lambda$ is the
wavelength of the illuminating beam, $D$ is the diameter of the aperture, and $f$ is the
focal length of the objective lens. Since $f/D$ is proportional to 1/NA, it is easy to
see that the resolution in EM should be many orders of magnitude better than in light
optics. In practice, though, EM resolution is worse than the value predicted by the
Rayleigh criterion owing to numerous aberrations present in the electron optics,
radiation damage of the specimens under study, and various instabilities present in the
microscope and in its environment. The latter include mechanical vibrations, acoustic
noise, stray electromagnetic fields, temperature variations of the microscope and in the
microscope room, air pressure changes, etc. Diffraction-limited electron optics would
have ~0.01 nm or better resolution, while the practical resolution limit in electron
microscopy in biology is currently, at the very best, at the 0.2-0.3 nm level.

Because electrons interact strongly with surrounding molecules, one of
the major requirements for electron microscope design is to provide an ultrahigh
vacuum inside the lenses along the beam path. The microscopes therefore have
their own vacuum systems with several pumps constantly pumping out residual gas
molecules to prevent undesirable electron scattering before and after the beam hits
the specimen. The vacuum should be good enough to allow electrons to travel at
least the length of the microscope column (usually more than a meter in length)
without colliding with residual gas molecules. That means high or ultrahigh vacuum,
with pressure readings below $10^{-5}$-$10^{-7}$ Torr ($10^{-3}$-$10^{-5}$ Pa).
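An order-of-magnitude feel for this requirement can be obtained from the kinetic-theory mean free path, kT/(√2 π d² p). The sketch below is a rough proxy only: it uses the gas-kinetic formula for molecule-molecule collisions with an assumed molecular diameter of 0.37 nm (roughly that of nitrogen) at room temperature, whereas the true electron scattering cross-section depends on the beam energy and is not modeled here.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
TORR_TO_PA = 133.322   # pascals per Torr


def mean_free_path_m(pressure_pa, temperature_k=293.0, diameter_m=3.7e-10):
    """Gas-kinetic mean free path, k T / (sqrt(2) * pi * d^2 * p).

    diameter_m is an assumed effective molecular diameter (about that of N2).
    """
    cross_section = math.pi * diameter_m ** 2
    return K_B * temperature_k / (math.sqrt(2.0) * cross_section * pressure_pa)


for torr in (760.0, 1e-3, 1e-5, 1e-7):
    path = mean_free_path_m(torr * TORR_TO_PA)
    print(f"{torr:8.0e} Torr  mean free path ~ {path:.2e} m")
```

With these assumptions the mean free path grows from tens of nanometers at atmospheric pressure to several meters at 10^-5 Torr, which is the scale of the microscope column.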

More than 25 years ago EM moved into a new era when investigators realized that
ultrafast cooling of specimens could preserve them in a vitrified buffer. The cooling
rate, though, must be so fast that ice crystals do not have time to form, and that
is very difficult to achieve. First of all, the specimens must be very thin to begin
with, less than 1,000 nm in thickness. That excludes most cells and certainly
tissues, leaving the field with single particles, that is, single protein or nucleic acid
molecules, their aggregates or complexes, cellular organelles, viruses, etc.
Secondly, specimens should be cooled to at least −150 °C to prevent water
crystallization, since vitreous water is stable only at temperatures below −140 °C. This
requires using liquid cryogens (such as ethane or propane) with a very high specific
heat, which allow the samples to be cooled very effectively and rapidly (at rates of
up to $10^{6}$ degrees per second). The vitrification preserves the biomolecules in a
nearly native state and creates a temporal snapshot of all the biomolecules in the
sample, since they are immobilized on a microsecond time scale.


8.6.2    Imaging  Single  Particles  and  Biomolecules 

Since electrons are invisible to the human eye, the images formed by electron optics
must be transformed into visible-light images, either by direct conversion or by
registering them with a sensor and then converting the recorded signal into a visible
image. One way to convert an electron image into a visible one is to use phosphors that
emit photons (fluoresce) when electrons hit them. Fluorescent screens are commonly used
in electron microscopes to observe images directly, and photographic film was
traditionally used as the image detector. Nowadays photographic film has largely been
replaced by digital electron detectors (typically CCD or CMOS cameras), which allow
images to be displayed on a computer screen immediately after acquisition. Typically a
very thin phosphor layer is used in these cameras to convert electrons to visible light,
which is then registered by the digital sensor. Recently a new class of electron
detectors, the direct electron detection devices, was developed; these register electron
images without intermediate conversion to light.

Fig. 8.9 Principle of single-particle reconstruction from 2D images. (a) In cryo-EM a
coherent electron beam is directed onto a sample consisting of particles embedded in
random orientations in vitreous ice. The collected images are 2D projections of the 3D
molecules. The collected raw data consist of many (hundreds or thousands of) images of
the same macromolecule in different orientations, which are then processed to produce a
3D model. Reproduced with permission from [99]. (b) Power of averaging. Left panel: raw
image of a single western equine encephalitis virus (WEEV) virion; central panel: 3D
reconstruction of WEEV obtained by combining 5,000 individual particle images; right
panel: central section through the reconstructed 3D map. The combination of these noisy
individual WEEV images results in an approximately 70-fold increase in the signal of the
3D map

EM images are "flat"; they represent two-dimensional (2D) projections of
three-dimensional (3D) samples (Fig. 8.9). These projections are not just shadows
originating from the shape of an object but rather true projections formed by summing
all the densities within a sample along the projection direction. That means that
information about the third dimension in the specimen is not readily accessible in an
image, which is disappointing since one would like to know the 3D structure of the
samples under study. Fortunately the information is not lost completely; it is still
possible to reconstruct the 3D structure of a sample using a number of its 2D
projection images (Fig. 8.9a). That was first formulated and proven by the Austrian
mathematician Johann Radon back in 1917 [123]. For an accurate reconstruction
many projections are needed: the more, the better. At the same time, combining a large
number of images greatly reduces the noise in the reconstruction, since it
effectively increases the electron dose and consequently suppresses the shot noise,
enhancing the signal and improving the quality of the reconstruction
(Fig. 8.9b). Mathematically speaking, the noise is reduced in proportion to the square
root of the number of individual images combined in a reconstruction, so if 10,000
images are used to reconstruct a volume, the noise is suppressed by a factor of 100.
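The square-root law for noise suppression is easy to verify with a small simulation. The sketch below is a toy model under stated assumptions: a one-dimensional "image" with a made-up signal, independent Gaussian noise of unit standard deviation in every pixel, and perfect alignment of all copies before averaging. Real cryo-EM data additionally require orientation determination and CTF correction.

```python
import math
import random
import statistics


def residual_noise_after_averaging(n_images, n_pixels=400, noise_sigma=1.0, seed=1):
    """RMS deviation from the true signal after averaging n_images noisy copies."""
    rng = random.Random(seed)
    signal = [math.sin(0.1 * i) for i in range(n_pixels)]   # hypothetical true signal
    sums = [0.0] * n_pixels
    for _ in range(n_images):
        for i in range(n_pixels):
            sums[i] += signal[i] + rng.gauss(0.0, noise_sigma)
    residuals = [sums[i] / n_images - signal[i] for i in range(n_pixels)]
    return math.sqrt(statistics.fmean(r * r for r in residuals))


for n in (1, 4, 25, 100):
    measured = residual_noise_after_averaging(n)
    print(f"N = {n:3d}  residual noise = {measured:.3f}  1/sqrt(N) = {1 / math.sqrt(n):.3f}")
```

The measured residual noise tracks 1/√N closely, so averaging 100 aligned images reduces the noise roughly tenfold, and by extrapolation 10,000 images give the factor of 100 stated above.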


8.6.3    Image  Processing 

Once an image set is acquired in an imaging session, the images have to be processed
to obtain a 3D reconstruction. First, each image is assessed for defects that impair the
image quality (e.g., specimen drift during exposure) and bad images are discarded
from further analysis. The effect of the microscope optics on image formation is
usually described by a "contrast transfer function" (CTF), which is an oscillating
function affecting image Fourier transforms. The transforms have zero amplitudes in
places where the CTF is zero, and since the CTF depends on the defocus used to acquire
an image, it is a good idea to collect images within a range of defocus values. Then
information that is lost in a particular image (zero amplitude) can be extracted from
other images where the CTF has large values in that region of Fourier space. Good images
are corrected for the CTF and then individual particle images are boxed out from them.
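The reason for collecting images over a range of defocus values can be illustrated with a deliberately simplified CTF. The sketch below keeps only the defocus term, CTF(k) = −sin(π λ Δz k²), and neglects spherical aberration, amplitude contrast, and envelope functions; sign conventions differ between software packages, and the 2 pm wavelength and the two defocus values are illustrative assumptions.

```python
import math


def ctf_defocus_only(k_per_nm, defocus_nm, wavelength_nm):
    """Simplified phase-contrast CTF with the defocus term only."""
    return -math.sin(math.pi * wavelength_nm * defocus_nm * k_per_nm ** 2)


def ctf_zeros_per_nm(defocus_nm, wavelength_nm, count=3):
    """Spatial frequencies of the first zeros: k_n = sqrt(n / (lambda * dz))."""
    return [math.sqrt(n / (wavelength_nm * defocus_nm)) for n in range(1, count + 1)]


WAVELENGTH_NM = 0.002            # ~2 pm, i.e. electrons at about 300 kV
DEFOCUS_A, DEFOCUS_B = 1000.0, 1500.0   # assumed defocus values in nm

first_zero_a = ctf_zeros_per_nm(DEFOCUS_A, WAVELENGTH_NM)[0]
print(f"first zero at {DEFOCUS_A:.0f} nm defocus: k = {first_zero_a:.3f} 1/nm "
      f"(spacing {1 / first_zero_a:.2f} nm)")
print(f"CTF there at {DEFOCUS_A:.0f} nm defocus: "
      f"{ctf_defocus_only(first_zero_a, DEFOCUS_A, WAVELENGTH_NM):+.3f}")
print(f"CTF there at {DEFOCUS_B:.0f} nm defocus: "
      f"{ctf_defocus_only(first_zero_a, DEFOCUS_B, WAVELENGTH_NM):+.3f}")
```

At the spatial frequency where the first image carries no information, the second image, taken at a different defocus, has nearly maximal contrast in this simplified model, so combining the two fills the gap.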

A critical assumption in image processing of single particles is that all the
particle images represent the same 3D object. It then becomes possible to implement
the extremely powerful idea of averaging, or combining information from many
different images, to create a single, much less noisy reconstruction of the object. If
that assumption fails, none of the processing works, since one would then be trying
to compare apples and oranges and the result would correspond to neither.
Combining information from many images is necessary for several reasons:
individual images are so noisy that the signal is frequently buried in the noise; CTF
correction fails in some regions of Fourier space and information is lost in those
regions; and individual images represent single projections of a 3D object, so to
restore the object in three dimensions one needs a large number of these projections at
all possible orientations. It is usually assumed that particles are embedded in ice in
random orientations, so selecting many different images provides many different
orientations of the object they represent.

Several groups of reconstruction algorithms are used in cryo-EM.
In Fourier space-based methods the 3D Fourier transform of an object is
reconstructed from the Fourier transforms of individual images,
followed by Fourier inversion back to the object space [32]. Back-projection
algorithms use the idea that a superposition of projections stretched along their
corresponding projection directions produces a reconstruction of the original
object. In algebraic reconstruction algorithms the reconstructions are
calculated iteratively, with reconstruction errors diminishing at each iteration until
the process converges [47, 53, 60, 79].

Fig. 8.10 Three-dimensional reconstructions of GroEL. (a) Cryo-EM image of GroEL molecules
embedded in vitreous water shows a very low signal-to-noise ratio. (b) 3D reconstruction
from a set of 135 micrographs resulting in a 0.42 nm resolution map (with a monomer
highlighted in red). True D7 symmetry was used for reconstruction. (c) 0.47 nm
reconstruction of the same sample using lower C7 symmetry. Independent "A" and "B" rings
with monomers shown in blue and yellow, respectively. Reproduced with permission from
Ludtke et al. [87]
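The back-projection idea described in the text above can be shown on a toy example. The sketch below is a two-dimensional analogue under strong simplifying assumptions: a 5 x 5 "object" containing a single bright pixel, only two projection directions (along rows and along columns), and no filtering. Cryo-EM reconstruction works in three dimensions with many thousands of orientations and weighted (filtered) back-projection.

```python
def project_rows(image):
    """Projection along the horizontal direction: one sum per row."""
    return [sum(row) for row in image]


def project_cols(image):
    """Projection along the vertical direction: one sum per column."""
    return [sum(col) for col in zip(*image)]


def back_project(row_sums, col_sums):
    """Smear each projection back along its own direction and superpose them."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    return [
        [row_sums[r] / n_cols + col_sums[c] / n_rows for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Toy object: a single bright pixel at row 1, column 3
obj = [[0.0] * 5 for _ in range(5)]
obj[1][3] = 1.0

recon = back_project(project_rows(obj), project_cols(obj))
for row in recon:
    print(" ".join(f"{value:.1f}" for value in row))

brightest = max(
    ((r, c) for r in range(5) for c in range(5)),
    key=lambda rc: recon[rc[0]][rc[1]],
)
print("brightest pixel:", brightest)
```

The superposed smears intersect at the position of the original pixel, which becomes the brightest point, while faint streaks remain along the projection directions. Adding more projection directions (and filtering) suppresses these streaks, which is one reason why many orientations are needed.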

8.6.4   Examples  of  3D  Reconstructions  Using  Cryo-EM 
Techniques:  Chaperonins,  Ribosomes,  and  Viruses 

GroE is a chaperonin that is required for the proper folding of a wide range of
proteins [38, 94]. It has been proposed that GroE acts as an Anfinsen cage, providing
the proper chemical environment to promote protein folding [37]. GroE is composed
of two proteins, GroEL and GroES. GroEL forms two back-to-back seven-membered
rings that are responsible for accepting and folding a polypeptide
chain [19, 25]. Figure 8.10 shows an example of how cryo-EM techniques are used to
reconstruct GroEL to ~0.4 nm resolution [87]. The raw 2D micrograph shows individual
GroEL molecules, which are hard to resolve because of the very low signal-to-noise
ratio (Fig. 8.10a). Figure 8.10b, c shows 3D reconstructions from a large set
of micrographs, resulting in a 0.42 nm resolution map (a single subunit is highlighted
in red). This represents a significant milestone in resolution for low-symmetry,
single-particle cryo-EM maps [87].

The ribosome structure has been studied by EM since the 1960s. Cryo-EM and image
processing now make it possible to reveal the mechanism of protein synthesis,
followed by protein folding with the help of the SecY-Sec61
complex that translocates nascent secretory proteins across cellular membranes
and integrates membrane proteins into lipid bilayers. Typical resolutions
achieved using single-particle cryo-EM are better than 1 nm, in some cases
approaching 0.6 nm, which allows segmentation of the different ribosome components,
mRNA, and tRNAs, along with the elongation factors used in protein synthesis. This is
a remarkable achievement for cryo-EM, since the ribosome does not have any symmetry
that could be used to improve the quality of the reconstruction [48].

Icosahedral viruses have the highest intramolecular symmetry in biology, having
a 60-fold symmetry, so that a single cryo-EM image is equivalent to 60 images of
asymmetric particles. In addition, these viruses are often well preserved, and several
viral structures have been reconstructed to better than 0.4 nm resolution. A cryo-EM
structure of an infectious subvirion particle of aquareovirus was recently reported at
0.33 nm resolution and revealed side-chain densities, leading to de novo construction
of a full-atom model of the viral particle [158].
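
The benefit of 60-fold symmetry can be illustrated with a small numerical sketch. It
assumes, as an idealization, that the 60 symmetry-related views carry independent Gaussian
noise of equal strength, in which case averaging them reduces the noise by a factor of
about the square root of 60. The function name, pixel count, and noise level are
hypothetical choices made only for this illustration.

```python
import numpy as np

def noise_after_averaging(n_copies, n_pixels=20000, sigma=1.0, seed=0):
    """Standard deviation of the noise left after averaging n_copies noisy views."""
    rng = np.random.default_rng(seed)
    # Each row is one noisy view of the same (here zero-valued) signal.
    noisy_views = rng.normal(0.0, sigma, size=(n_copies, n_pixels))
    return noisy_views.mean(axis=0).std()

single = noise_after_averaging(1)       # one asymmetric view
averaged = noise_after_averaging(60)    # 60 symmetry-related views averaged
print(single / averaged, np.sqrt(60))   # the two numbers should be close
```

Real symmetry-related views are not perfectly independent, so the gain in practice is
expected to be smaller than this idealized estimate.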


8.7  Perspectives 


Single-molecule methods are providing us with more and more fundamental information
on the structure and function of proteins. The progress in the visualization and
manipulation of single proteins during the last 20 years has been impressive. The
implementation of single-molecule methods by biochemists and biophysicists has grown almost
exponentially, from a handful of papers published in the early 1990s to nearly 1,400
papers published in 2011 alone (Web of Knowledge). Clearly, single-molecule methods
are becoming an indispensable tool for understanding how proteins work in real time.
However, most single-molecule methods have an inherent problem: they are mainly
concerned with in vitro studies of purified proteins that are removed from the cellular
environment. The key future challenge is to bring single-molecule methods into living
cells. This is not an easy task, since it will require combining several single-molecule
methods and bringing together nanoscience, biophysics, and cell biology. For
example, the combination of single-molecule manipulation techniques (e.g., tweezers
or AFM) and single-molecule detection techniques (e.g., fluorescence) is an important
development [51, 67, 80, 84, 95] and should enable us to tackle more complex problems.
Through the information unveiled by the different single-molecule methods, we
are entering a new and exciting time in biology which, in combination with the knowledge
generated in this proteomic era, is likely to move us closer to understanding how
proteins work in living cells.

Chapter  9

Helicase  Unwinding  at  the  Replication  Fork 

Divya  Nandakumar  and  Smita  S.  Patel 


Abstract Ring-shaped hexameric helicases play an essential role in unwinding
double-stranded DNA during genome replication. The NTPase-powered
unwinding activity of the hexameric helicases is required both for replication initiation
and for fork progression. We describe ensemble biophysical methods to measure
the unwinding activity of ring-shaped helicases during fork progression using the
T7 bacteriophage replicative helicase gp4A' as a model enzyme. These assays provide
insights into the stepping mechanism of translocation, the active or passive mechanism
of unwinding, and regulation by associated proteins such as single strand DNA
binding protein, DNA polymerase, and primase enzymes.

Keywords  Hexameric  helicase  •  Replication  •  DNA  unwinding  •  T7  bacteriophage 
•  DNA  polymerase  •  SSB  •  Primase  •  Unwinding  assays  •  gfit  •  Global  regression 
analysis  •  Unwinding  mechanism 


9.1  Introduction 

Helicases are molecular motor proteins that unwind double-stranded (ds) DNA into
single strands using the energy of NTP hydrolysis. This activity is required not only
for DNA replication but also for DNA repair, recombination, and transcription [1-3].
Helicases are found from viruses to humans, where they play essential roles in
practically all DNA and RNA metabolic processes. Therefore, helicases are attractive
targets for developing new antiviral, antibacterial, and therapeutic agents. In
humans, defects in helicase activity due to mutations cause genome instability that
can ultimately result in diseases such as cancer and premature aging.

DNA unwinding is a multistep process involving both chemical and mechanical
steps. To initiate unwinding, the helicase must bind DNA and assemble into its functional
monomeric/multimeric structure on the DNA. To unwind DNA, the helicase
binds to NTP and hydrolyzes it, forming intermediates that stabilize various conformational
changes that support helicase translocation along single-stranded DNA
(ssDNA) and separation of the dsDNA. The NTPase and translocation/unwinding
cycling continue while the helicase remains attached to the DNA, leading to processive
separation of the dsDNA strands. To understand helicase mechanisms, each step
is kinetically and thermodynamically characterized and the relationships and dependencies
between the various steps are determined. Additionally, this information is
related to high-resolution structures of helicase complexes with DNA and NTP.
Biochemical and biophysical methods such as pre-steady state kinetics, X-ray
crystallography, electron microscopy, single molecule kinetics, and advancements in
computational biophysics have been critical in enhancing our understanding of the
functioning of helicases [4-6].

A class of helicases distinguished by their ring-shaped structure is widely
involved in genome replication. These ring-shaped helicases assemble from
six identical subunits in most organisms, except in eukaryotes, where the replicative
hexameric helicase minichromosome maintenance protein (MCM2-7) assembles
from six different subunits [7, 8]. The ring-shaped assembly creates a central channel
that binds DNA, and the topological linkage of the DNA and the ring confers
high processivity on this class of helicases, allowing them to unwind long segments
of DNA for efficient replication. The central channel is flexible and can accommodate
ssDNA or dsDNA, and many hexameric helicases can translocate on both
types of DNA [9, 10].

The ssDNA translocation activity of hexameric helicases is important for DNA
unwinding and is powered by the NTPase reaction that occurs at the active sites located
at the hexamer subunit interfaces [11-14]. The direction of translocation can be either
5' -> 3' or 3' -> 5' depending on the helicase. Recent biochemical [15, 16] and structural
studies support an ordered sequential mechanism of NTP hydrolysis around the ring.
The crystal structures of bovine papilloma virus E1 and Escherichia coli Rho helicases
indicate that at any given time, five consecutive subunits interact with five consecutive
nucleotides of ssDNA (or ssRNA) in a spiral staircase conformation [13, 14]. Each subunit
is in a distinct NTP ligation state as it goes through the steps of NTP binding,
hydrolysis, Pi release, and NDP release in a sequential manner. During translocation,
NTP binding promotes a subunit at the top of the staircase to bind to a free nucleotide of
ssDNA while at the same time NDP release promotes its neighboring subunit to release
its nucleotide of ssDNA. Thus, directional movement of the helicase on ssDNA is akin to a
wheel (helicase) rolling down the road (ssDNA) without slipping, through interactions
within the central channel of the helicase.

To separate the strands of dsDNA, the helicase must couple its ssDNA translocation
activity to base pair melting at the replication fork junction. Almost all hexameric
helicases unwind the fork DNA by a strand exclusion model [10, 17-22]. In
this model, the helicase ring translocates along one of the ssDNA strands that it
binds in its central channel while excluding the complementary strand (Fig. 9.1).
Strand exclusion is important because if the helicase ring surrounds the dsDNA
instead of ssDNA, then strand separation does not occur because the helicase
translocates along dsDNA without unwinding. The coupling of ssDNA translocation and
base pair melting that leads to strand separation can occur by a passive or active
mechanism [23]. In a passive mechanism, the helicase does not destabilize the junction
base pair but simply translocates along ssDNA by capturing an opened base
generated by thermal fraying, whereas in the active mechanism, the helicase actively
disrupts the base pairs at the junction. We will discuss methods to distinguish
between the two mechanisms.

Fig. 9.1 Strand exclusion model of helicase unwinding. A representation of the strand
exclusion model in which the ring-shaped helicase binds the translocating strand in the
central channel and excludes the complementary strand to separate the two strands

Replicative helicases work in association with enzymes such as the DNA polymerase
and DNA primase and accessory proteins such as the single strand DNA
binding protein (SSB) [24-26] to ensure efficient and timely replication of genomic
DNA. To understand the role of the helicase during DNA replication, it is necessary
to study the unwinding activity of the isolated helicase as well as the helicase in
complex with associated proteins. Bacteriophage T7 provides an ideal system to
carry out such enzymological studies of DNA replication [25]. The replication
machinery of phage T7 is one of the simplest, consisting of the helicase-primase
protein (T7 gp4), DNA polymerase (heterodimer of T7 gp5 and E. coli thioredoxin),
and single strand binding protein (T7 gp2.5). No accessory loaders or connector
proteins are required, and efficient DNA replication can be reconstituted in vitro
from recombinant proteins.

Unwinding of long stretches of DNA by the helicase occurs in a stepwise manner. In
each step the helicase unwinds a certain number of base pairs (step size) at a certain
rate (stepping rate). These biophysical parameters, along with others such as the rate
of unwinding a single base pair and how far the helicase travels before dissociating
from the DNA (processivity), are critical for understanding the mechanism of action of
this enzyme. In this chapter, we will discuss ensemble biophysical methods to characterize
these most basic parameters of DNA unwinding by helicases. Although the
methods are described using T7 helicase as an example, they are generally applicable
to most helicases. We will show that the single base pair unwinding rate from ensemble
methods can be used as a basic handle to determine if the helicase unwinds DNA
by an active or passive mechanism and to probe how helicase activity is regulated by
single strand binding protein (SSB), DNA polymerase, and primase.

9.2  Ensemble  Unwinding  Assay  Conditions

The ensemble assays that we describe to measure DNA strand separation by the
helicase are all-or-none assays that detect only the end product of the reaction.
These assays do not detect partial unwinding of dsDNA, for which other methods
can be used [27, 28]. We use two kinds of assays to detect DNA strand separation:
a radiometric discontinuous gel-based assay and a fluorescence-based real time assay.
In this section, we describe the DNA substrate and the conditions for
the assays in detail.


9.2.1    DNA  Substrate 

A typical DNA substrate to measure the unwinding activity of replicative ring-shaped
helicases is a fork DNA that contains a linear dsDNA region and two
noncomplementary ssDNA overhangs (Fig. 9.2a, b). This substrate mimics half of
a replication fork advancing in one direction. The fork DNA substrates are chemically
synthesized; therefore, both the length and sequence of the DNA can be
easily controlled. The phage T7 helicase is a 5'-3' translocase, and strand separation
occurs when the helicase assembles around the 5'-overhang and translocates
in the 5'-3' direction, excluding the 3'-strand from its central channel. Optimum
unwinding of dsDNA by T7 helicase requires the 5'-overhang to be 35 nt long and the
3'-overhang to be 15 nt. Such optimal lengths need to be determined experimentally by
comparing the unwinding rates and processivity values for DNA substrates with different
lengths of 5' and 3' overhangs [18].


9.2.2    Steady  State  Versus  Pre-steady  State  Kinetics 

Steady state kinetic approaches are generally not suitable for determining the
unwinding rates of helicases. Under steady state experimental conditions, the fork
DNA substrate is used in excess over the helicase and one measures multiple
rounds of helicase loading, unwinding, and helicase dissociation from the end.
The observed rate under steady state conditions is dictated primarily by the slow
rate of helicase dissociation and reassociation. DNA unwinding therefore needs to
be monitored under pre-steady state conditions where the steps of helicase association
and dissociation are decoupled from the steps of DNA unwinding.
Synchronization of the enzyme molecules is important in determining the unwinding
rate; for this, the helicase needs to be preassembled on the fork DNA without
any unwinding of the dsDNA taking place in the assembly mixture.

[Figure 9.2 panels: Stopped-Flow Assay; Quenched-Flow Assay]

Fig. 9.2 Ensemble DNA unwinding assays. Instrumental design, DNA substrate, and representative
kinetic traces for the fluorescence-based stopped-flow assay (a) and gel-based quenched-flow
assay (b) to measure DNA unwinding. (a) The KinTek stopped-flow instrument rapidly fires
reactants from the two drive syringes into a mixing cell for observation by fluorescence or
absorbance with time resolution of 1 ms. The DNA substrate is labeled with fluorescein at the 5' end of
the lower strand and has a GGG at the 3' end of the translocating strand to quench the fluorescence
when present as dsDNA. Unwinding results in strand separation and a time-dependent increase in
fluorescence as seen in the sample kinetic trace. (b) The quenched-flow instrument allows rapid
mixing of two reactants followed by a delay line after which the mixed reactants are quenched. The
duration of the reaction is determined by the volume of the delay line and the flow rate through the
delay line. Radiolabeling of the translocating strand at the 5' end allows visualization of the forked
DNA substrate and released ssDNA product at various time points when resolved on a native
polyacrylamide gel. The kinetic trace shows a time-dependent increase in ssDNA fraction


9.2.3   Assembly  of  the  Helicase  on  the  DNA 

Ring-shaped helicases assemble very slowly on DNA and many require special
conditions or protein loaders to efficiently bind DNA. Although T7 helicase does
not require a loader protein to bind DNA, it needs dTTP to form hexameric rings
and to bind DNA, but can do so without Mg2+ [29]. Since Mg2+ is required for
dTTP hydrolysis, we prevent DNA unwinding in the preassembly mixture by
leaving out Mg2+ and adding it to initiate the reaction. Single round conditions are
established by including a trap, such as SSB protein, which binds free ssDNA,
or excess ssDNA, which binds free helicase. The optimal concentration of the
trap is determined experimentally by adding increasing amounts of the trap to the
preassembly mixture.

It is important to note that SSB can potentially have other effects on the unwinding
mechanism apart from acting as a DNA trap (Sect. 9.6). When using ssDNA to
trap excess helicase, the commonly used traps are the 5'-strand or 3'-strand itself or
a polynucleotide DNA such as dT(30-90). With some helicases, the specific ssDNA
used as the trap can influence the unwinding mechanism of the helicase [30]. Thus,
it is important to ensure that the traps do not affect the mechanism of unwinding by
verifying that different traps have a uniform effect on unwinding.


9.3    Unwinding  Assays 

T7 helicase unwinds short fork DNA substrates ranging in length from 20 to 90 bp
within milliseconds. Hence, the pre-steady state single round kinetic
measurements are carried out in rapid mixing apparatus such as the stopped-flow or
quenched-flow instruments (Fig. 9.2a, b).


9.3.1    Real  Time  Unwinding  Assay 

To measure DNA unwinding in real time, the fork DNA substrate is modified with
a fluorophore (such as fluorescein) incorporated at the 5'-end of the displaced
strand. When the DNA is double-stranded, the fluorescein fluorescence is
quenched by a string of three guanosines introduced at the 3' end of the complementary
strand [31, 32]. When the helicase unwinds the dsDNA and the displaced
strand becomes single-stranded, the fluorescein moves away from the string of Gs
and the fluorescence increases when the strands are fully separated. This results in
a time-dependent increase in fluorescence, which is measured continuously in a
stopped-flow apparatus (Fig. 9.2a).


9.3.2    Gel-Based  Unwinding  Assay 

The gel-based assay is discontinuous: one of the DNA strands is 5'-end labeled
with 32Pi, allowing detection of the fork DNA substrate and the ssDNA product
(Fig. 9.2b). The reactions are mixed in a quenched-flow instrument and the unwinding
reaction is stopped after defined periods with EDTA and sodium dodecyl sulfate (SDS),
which dissociates the helicase from the DNA. The quenched reactions are analyzed by
native polyacrylamide gel electrophoresis, which resolves dsDNA from the fully unwound
ssDNA; both are quantified using the phosphorimager software ImageQuant (GE
Healthcare). The partially unwound intermediates reanneal after the unwinding reaction
is stopped. The only intermediates that may not reanneal, and instead fall apart into
ssDNAs after quenching, are those in which only a very short portion of dsDNA remains
to be unwound. This short portion of dsDNA is defined as the minimal duplex length (Lm) and
it can be determined experimentally as described below (Sect. 9.4.2).

We describe only two ensemble experiments, but variations of these have been
successfully used for studying other helicases [33-35]. Of the two assays, the
fluorescence-based assay has the advantages of being real time and high throughput,
providing a large number of data points to accurately determine the unwinding rate,
and having a smaller Lm compared to the gel-based assay (explained below). The
radiometric assay is discontinuous, but it provides a reliable way to determine the
amplitude, or the fraction of DNA unwound at any given time. The amplitude is used to
determine the helicase processivity, which estimates how far the helicase moves
along the DNA before it falls off.


9.4    The  Unwinding  Kinetic  Trace 

The pre-steady state single round kinetics of unwinding shows an initial lag followed
by a steep increase in ssDNA product formation that plateaus after a certain time. The
initial lag represents the time the helicase takes to unwind the dsDNA region before
the strands become fully separated. The lag time depends on the dsDNA length and it
increases as the length of the dsDNA increases (Fig. 9.3a). The steady increase in lag
time with increasing dsDNA length is a good indication that the assay is measuring
the steps of unwinding rather than rate-limiting initiation. The plateau in the trace
represents the population of DNA strands that are fully separated at the end of the
reaction. As the trace represents an all-or-none unwinding reaction, the slope should
typically be steep, corresponding to the synchronized generation of ssDNA. A shallow
increase observed under certain conditions or with certain DNA substrates is an indication
of a distribution of enzyme molecules over the intermediate steps in the reaction,
generated by helicase pausing or nonuniform stepping rates.


9.4.1    Quantifying  the  Assembled  Helicase-DNA  Complex 

To prevent the newly unwound DNA strands from reannealing during the course of
the reaction, the fork DNA concentration is kept to a minimum (low nanomolar
range). This is especially important when the unwinding rate is slow and reactions
are monitored for minutes. The concentration of helicase is then adjusted to saturate
the fork DNA substrate. To determine the concentration of helicase to add to the
preassembly mixture, one needs to know the dissociation constant (Kd) of the
helicase-fork DNA complex under the conditions of the experiment. The dissociation
constant is a measure of the strength of binding or affinity of an enzyme for its substrate.
We describe here a method to determine the apparent Kd or Km of the T7 helicase-fork
DNA complex in the presence of dTTP using the unwinding assay itself (Fig. 9.3b).
The measured value is the apparent Kd, as the readout for the experiment
(unwinding) is a step associated with binding but not binding itself. The Km of dTTP
ranges from 20 to 200 µM depending on the stability of the dsDNA [36]. Near-saturation
dTTP concentrations of 0.5 mM in the stopped-flow and 2 mM in the
quenched-flow assays are used in the examples discussed here. Helicases can also
use alternate NTPs as fuel in addition to the most commonly used NTP. For example,
in addition to dTTP, T7 gp4A' can use dATP and ATP as substrates, with
unwinding being faster with ATP. However, use of ATP also promotes slippage during
unwinding [37]. Similarly, the T4 bacteriophage helicase gp41 can also use ATP
or GTP for unwinding, with the rates being faster with GTP than ATP [38].

[Figure 9.3 panels: (a) Sample Kinetic Traces, plotted against Time (sec); (b) Optimum Enzyme
Concentration Determination, plotted against Protein (nM); (c) Determination of Lm, plotted
against dsDNA length (bp) for the quenched-flow and stopped-flow assays]

Fig. 9.3 DNA unwinding assay parameters. (a) Pre-steady state sample kinetic traces showing
unwinding of 18-90 bp length of fork DNA by 100 nM T7 helicase (T7 gp4A') at 18 °C in the
presence of 2 mM dTTP and excess ssDNA trap using the gel-based assay. The time taken for half
the molecules to be unwound is an approximate value of the t1/2 for the reaction. (b) Estimation of
apparent Kd: Plot of fraction of substrate unwound at increasing concentrations of T7 helicase is fit
to a hyperbola to obtain the apparent Kd in the absence of Mg2+. In this experiment, 5 nM fork DNA
was incubated with various concentrations of T7 helicase in the presence of 2 mM dTTP and 5 mM
EDTA for 15 min. This was mixed with 2 mM dTTP, 3 mM ssDNA trap, and 9.4 mM MgCl2 to
initiate the unwinding reaction, which was quenched after 30 s. The fraction of DNA unwound was
estimated from native polyacrylamide gel. (c) Quick estimation of Lm: Lm or the minimal duplex
length refers to the shortest duplex length that does not fall apart in the absence of the helicase. The
x intercept from a plot of t1/2 of unwinding versus DNA length for the gel-based and real time
fluorescence assays provided Lm of 18 and 0.64 bp, respectively

A mixture of 40 bp radiolabeled fork DNA (2 nM) and 1 mM dTTP incubated
with various concentrations of T7 helicase (5-500 nM) is mixed with Mg2+ and
ssDNA trap and quenched after 30 s in a quenched-flow instrument. A plot of the
fraction of DNA unwound in 30 s versus helicase concentration fits to a hyperbola
with Km of 21 nM (Fig. 9.3b), which is the apparent Kd of the helicase-fork DNA
complex in the presence of dTTP without Mg2+. This assay provides a measure of
the active enzyme-substrate complex in the preassembly mixture. This is more reliable
than simply using the Kd obtained from other equilibrium binding methods,
because it is possible that a fraction of the enzyme forms a nonproductive complex
with the DNA and is not available for unwinding.

The same information can be obtained from the real time stopped-flow assay,
where a mixture of fluorescent-labeled fork DNA (2 nM) and T7 helicase (5-500 nM)
is incubated with 1 mM dTTP in one syringe of the stopped-flow instrument and
mixed with Mg2+ and ssDNA trap from the other syringe. The difference between the
initial and final plateau values of fluorescence (amplitude, A) increases as the T7 helicase
concentration is increased in the preassembly mixture. The plot of amplitude versus
T7 helicase concentration fits to a hyperbola with Km of 18 nM, which is very close
to the value from the quenched-flow assay.
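
The hyperbolic fit used in both assays can be sketched as follows. This is a minimal
illustration, not the fitting procedure used for Fig. 9.3b: it assumes the simple form
F = Fmax [E] / (Kd + [E]), scans a grid of candidate Kd values, and solves for the best
amplitude at each one by linear least squares. The data points are synthetic and
noise-free, generated with a hypothetical Kd of 20 nM and amplitude of 0.8.

```python
import numpy as np

def hyperbola(conc, amplitude, kd):
    """Fraction unwound (or fluorescence amplitude) versus helicase concentration."""
    return amplitude * conc / (kd + conc)

def fit_hyperbola(conc, signal, kd_grid):
    """Grid search over Kd; for each Kd the best amplitude is a linear least-squares fit."""
    best = None
    for kd in kd_grid:
        basis = conc / (kd + conc)
        amplitude = np.dot(basis, signal) / np.dot(basis, basis)
        sse = np.sum((signal - amplitude * basis) ** 2)   # sum of squared residuals
        if best is None or sse < best[0]:
            best = (sse, amplitude, kd)
    return best[1], best[2]

# Synthetic titration (hypothetical values, in nM), spanning the 5-500 nM range.
helicase_nM = np.array([5, 10, 20, 50, 100, 200, 500], dtype=float)
fraction_unwound = hyperbola(helicase_nM, 0.8, 20.0)

kd_grid = np.arange(1.0, 200.5, 0.5)
amplitude_fit, kd_fit = fit_hyperbola(helicase_nM, fraction_unwound, kd_grid)
print(amplitude_fit, kd_fit)
```

With real, noisy data a nonlinear least-squares routine would normally replace the grid
search, but the grid version makes the logic of the fit explicit.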


9.4.2    Estimation  of  Lm

To determine the unwinding parameters, the actual length of dsDNA that is unwound
by the helicase needs to be determined. A portion of the dsDNA, defined as the Lm
or the minimal stable dsDNA length, falls apart spontaneously before the helicase
reaches the end of the fork DNA (Fig. 9.3c) of length L. The effective length of
DNA that the helicase unwinds is therefore L-Lm. The value of Lm depends on the
type of unwinding assay (real time versus discontinuous gel-based assay), the GC
content of the dsDNA, and the temperature of the experiment. A quick way to estimate
Lm is from the x intercept of a plot of the t1/2 of unwinding versus L. An alternate
method to determine Lm using methyl phosphonate-modified DNA has been used in the
kinetic analysis of Dda helicase [39].
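
The x-intercept estimate can be sketched as a straight-line fit of the half-time of
unwinding against duplex length. The numbers below are synthetic and hypothetical (a
constant 0.05 s per base pair and an Lm of 18 bp were used to generate them); the sketch
only illustrates how the intercept is read off.

```python
import numpy as np

def estimate_lm(lengths_bp, half_times_s):
    """Fit half-time = slope * L + intercept and return (x intercept, slope)."""
    slope, intercept = np.polyfit(lengths_bp, half_times_s, 1)
    lm = -intercept / slope          # duplex length at which the half-time extrapolates to zero
    return lm, slope

# Synthetic data: duplex lengths (bp) and half-times (s) generated with Lm = 18 bp.
lengths = np.array([30.0, 40.0, 60.0, 90.0])
half_times = 0.05 * (lengths - 18.0)

lm, slope = estimate_lm(lengths, half_times)
print(lm, slope)
```

The reciprocal of the slope gives only a rough indication of the unwinding rate in bp/s,
which is one reason a global fit to the full kinetic traces is the more accurate approach.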

Typically the gel-based assay of unwinding provides a larger value of Lm than the
real time assay. This is because the partially unwound duplexes, especially those
intermediates that have a very short stretch of dsDNA remaining to be unwound, fall
apart in the time period between reaction quenching and product analysis by gel
electrophoresis. Using the semiquantitative method of t1/2 versus L, we obtained an
Lm of 18 bp for the unwinding of 18-90 bp fork DNA (average GC content 32 %)
by the gel-based assay (x intercept in Fig. 9.3c). The real time assay, on the other
hand, provided an Lm of 0.64 bp from the t1/2 of unwinding a set of 40-70 bp fork DNAs
(average GC content 5 %) (Fig. 9.3c). Lm can be more accurately obtained from global
fitting of the unwinding of a set of fork DNAs of different lengths, as described in
Sect. 9.5.1 (Computational Model for Unwinding).


9.5    Stepping Model of DNA Unwinding

T7 helicase can unwind long stretches of DNA that exceed the DNA binding site of a
single hexamer. Such processive unwinding of dsDNA can be represented by a stepping
model, in which the helicase moves unidirectionally along ssDNA in a series of
discrete steps, each coupled to its biochemical cycle of NTP hydrolysis (Fig. 9.4).
The number of base pairs unwound in each step is defined as the step size (s), and the
rate at which steps are taken as the stepping rate (k). The step size and stepping
rate are the most basic parameters of the helicase motor, but they are difficult to
measure. The ensemble all-or-none unwinding assays described here can be
computationally fit to a uniform stepping model that assumes equal step size and
stepping rate (Fig. 9.4a). The average step size obtained from these measurements is
referred to as the kinetic step size, because it gives the number of base pairs
unwound between two rate-limiting steps of unwinding a stretch of dsDNA and may not
correspond to the elementary step size that is coupled to the NTPase cycle. The
helicase stepping process could be affected by the composition of the DNA sequence and
may not be uniform as assumed (Fig. 9.4b). Despite this limitation, the model provides
reliable values of the average single base pair unwinding rate and processivity of the
helicase, which allow us to make inferences about the helicase mechanism of action and
the effect of proteins that regulate helicase activity.


9.5.1    Computational  Model  for  Unwinding 

The  uniform  stepping  model  considers  the  total  unwinding  to  consist  of  discrete 
steps  of  equal  size  (s)  and  uniform  stepping  rate  (k).  In  the  all-or-none  unwinding 
experiments  under  single  round  conditions,  we  monitor  only  the  product  of  the  last 
step  in  the  series  of  unwinding  steps,  which  is  described  by  the  incomplete  gamma 
function  (9.1).  The  mathematical  basis  for  use  of  the  gamma  function  has  been 
comprehensively  described  by  Lucius  et  al.  [40]  and  is  beyond  the  scope  of  this 
chapter.  The  incomplete  gamma  function  allows  the  number  of  unwinding  steps  (n) 
to  be  continuous  (not  only  integers)  and  hence  allows  us  to  float  the  parameter  n  in 
the  computation.  The  fitting  provides  the  stepping  rate  (k)  and  step  size  (s)  to  unwind 
the  dsDNA  of  effective  length  (L-Lm). 


$$F(t) = F_0 + \sum_{i=1}^{N} A_i \, \frac{\gamma(n, k_i t)}{\Gamma(n)} \tag{9.1}$$

[Fig. 9.4a panel label: Stepping model with uniform step size (s) and stepping rate (k)]

Fig. 9.4 Stepping model of unwinding. (a) Uniform stepping model: the unwinding
reaction is considered to consist of a series of discrete steps of uniform step size
(s) and stepping rate (k). The step size represents the number of bp unwound by the
helicase between two rate-limiting steps. (b) Nonuniform stepping model: DNA
composition or heterogeneity in the helicase population can result in varying step
size and stepping rate. This is accounted for in the nonuniform stepping model, which
assumes nonuniform stepping rates (k1, k2, k3, ..., kn) and step sizes (s1, s2, s3,
..., sn) of helicase movement, where n is the total number of steps. The varying step
size and stepping rate are indicated by different sized arrows. (c) Estimation of
rate, Lm, amplitude, and step size by global data fitting: time courses of unwinding
obtained by the gel-based assay for fork DNA substrates of different lengths (18-90
bp) by 100 nM T7 helicase at 18 °C in the presence of 2 mM dTTP and excess ssDNA trap,
globally fit to a two-population uniform stepping model using the gfit application,
provided an average base pair unwinding rate of 15.5 bp/s. (d) Processivity: a plot of
amplitude obtained through global fitting versus effective duplex length unwound
(L-Lm) is fit to A = A0 × P^(L-Lm) to obtain a processivity of 0.9911 for the isolated
helicase and 0.9979 for T7 helicase in the presence of SSB


where γ(n, kt)/Γ(n) is the normalized incomplete gamma function, n = (L-Lm)/s is the
number of unwinding steps required to observe the product, F(t) is the fraction of
single-stranded DNA at time t, and F0 is the background signal.

An inherent problem in analyzing ensemble processes is the possibility of multiple
species of starting enzyme-DNA complex or populations that unwind at different average
rates. Our model accounts for this by calculating the sum of N unwinding processes,
providing amplitudes (Ai) and stepping rates (ki) for each population. A similar model
developed by Lucius et al. addresses the same problem by incorporating a term for a
nonproductive enzyme-DNA complex and can fit the data to two populations [40]. The
software gfit (MATLAB), developed by Mikhail Levin, allows us to fit the unwinding
kinetic data to the model [unwinding.m] to obtain relevant parameters [41].
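
To make the model concrete, the following is a minimal Python sketch of a multi-population uniform stepping model, assuming the form F(t) = F0 + Σ Ai · γ(n, ki t)/Γ(n) with n = (L-Lm)/s. It is not the gfit/unwinding.m implementation. The function names are hypothetical, the incomplete gamma function is evaluated with a simple power series written here for self-containment, and the amplitudes and the second (slower) population in the example are invented for illustration. Only Lm, s, and the 18.6 steps/s stepping rate are taken from the global fit quoted in this section.

```python
import math


def reg_lower_gamma(a, x, tol=1e-12):
    """Regularized lower incomplete gamma function P(a, x) for a > 0.

    Uses the series gamma(a, x) = e^-x x^a * sum_k x^k / (a (a+1) ... (a+k)),
    which allows non-integer a (a non-integer number of steps).
    """
    if x <= 0.0:
        return 0.0
    # Far above the mean of a gamma distribution with shape a, P equals 1 to
    # well within double precision; returning early also avoids overflow.
    if x > a + 10.0 * math.sqrt(a) + 40.0:
        return 1.0
    term = 1.0 / a
    total = term
    k = 0
    while term > tol * total:
        k += 1
        term *= x / (a + k)
        total += term
    log_value = a * math.log(x) - x - math.lgamma(a) + math.log(total)
    return min(1.0, math.exp(log_value))


def fraction_unwound(t, L, Lm, s, populations, F0=0.0):
    """Fraction of ssDNA product at time t for duplex length L.

    populations is a list of (amplitude A_i, stepping rate k_i) pairs.
    """
    n = (L - Lm) / s  # number of steps needed to release the product
    return F0 + sum(A * reg_lower_gamma(n, k * t) for A, k in populations)


# Lm, s and k = 18.6 steps/s from the text; amplitudes and the slow population
# (5 steps/s) are hypothetical.
populations = [(0.6, 18.6), (0.2, 5.0)]
for t in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(t, round(fraction_unwound(t, L=40, Lm=14.8, s=0.8, populations=populations), 3))
```

The lag before product appears grows with n, which is why longer duplexes show longer lag times in the all-or-none assay. A global fit would minimize the residuals of this function over all duplex lengths at once while sharing k, s, and Lm.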

To estimate the minimal stable duplex length, Lm, the unwinding traces for a set of
fork DNAs of different lengths (L) are fit to the stepping model while globally
constraining the apparent stepping rate, step size, and Lm. We show here gel-based
time traces of unwinding fork DNAs from 18 to 90 bp with 32 % average GC content,
using ssDNA as a trap, for which the global fit provided an Lm of 14.8 bp, ks of
18.6 steps/s, and s of 0.8 bp (Fig. 9.4c). The average single base pair unwinding rate
ku was calculated as ks × s = 15.5 bp/s (the rounded values quoted here give
18.6 × 0.8 ≈ 14.9 bp/s, so the reported rate presumably reflects unrounded fit
parameters).

9.5.2    Processivity of DNA Unwinding

Processivity of single base pair unwinding (P) is defined as the probability that the
helicase unwinds a base pair rather than dissociating from that position on the DNA,
and it estimates how far the helicase moves along the DNA before dissociating.
Processivity is determined from the amplitudes of DNA unwinding traces of fork DNAs of
different lengths. The amplitude can be determined from computational fitting or by
simple examination of the unwinding time traces in single round experiments. For
example, unwinding of fork DNAs from 18 to 90 bp under single round conditions shows a
decrease in amplitude with increasing dsDNA length. Fitting the plot of amplitude (A)
versus (L-Lm) to the relationship A = A0 × P^(L-Lm) [40] provides a processivity of
0.9911 for T7 helicase (Fig. 9.4d). This value, obtained by fitting the unwinding data
to the two-population stepping model, is more accurate than the previously published
value of 0.9835, which was obtained by fitting the data to a one-population stepping
model [42]. The P of 0.9911 indicates that T7 helicase unwinds on average about 112 bp
of DNA before dissociating, based on the relation average distance travelled =
1/(1 - P) [40]. The single base pair processivity P is equal to ku/(ku + kd), where ku
is the average single base pair unwinding rate and kd is the average rate of helicase
dissociation after each base pair is unwound. From the processivity data and the
average single bp unwinding rate, we determine that T7 helicase dissociates with an
average rate of 0.14 s⁻¹ during unwinding. The dissociation rate of T7 helicase during
unwinding also increases with increasing GC content of the DNA. This could be an
effect of the enhanced stalling observed for the helicase on GC patches in single
molecule experiments. Contrary to expectations for a ring-shaped helicase that binds
DNA in the central channel, the unwinding studies reveal a surprisingly low
processivity for T7 helicase compared to its processivity during ssDNA translocation
(112 versus 7,500 bp). The low processivity is probably a consequence of interactions
of the helicase with the fork junction. We show below that addition of E. coli SSB or
DNA polymerase increases the processivity of the complex during unwinding.
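
The relations in this section can be checked numerically. The Python sketch below assumes a simple log-linear least-squares fit of A = A0 × P^(L-Lm), which is not necessarily how the published fit was performed. The function names are hypothetical, and the amplitude data are synthetic, generated from P = 0.9911 with an assumed A0 of 0.9. The derived quantities use only the relations stated in the text: mean distance = 1/(1 - P), and P = ku/(ku + kd) rearranged to kd = ku(1 - P)/P.

```python
import math


def fit_processivity(effective_lengths_bp, amplitudes):
    """Fit A = A0 * P**x (x = L - Lm) by linear regression of ln(A) on x.

    Returns (A0, P).
    """
    xs = effective_lengths_bp
    ys = [math.log(a) for a in amplitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), math.exp(slope)


def mean_distance_bp(P):
    """Average number of bp unwound before dissociation, 1 / (1 - P)."""
    return 1.0 / (1.0 - P)


def dissociation_rate(P, ku):
    """kd from P = ku / (ku + kd), i.e. kd = ku * (1 - P) / P."""
    return ku * (1.0 - P) / P


# Synthetic amplitudes for L = 18, 30, 40, 60, 90 bp with Lm = 14.8 bp (A0 = 0.9 assumed).
x_values = [L - 14.8 for L in (18, 30, 40, 60, 90)]
synthetic_amplitudes = [0.9 * 0.9911 ** x for x in x_values]

A0_fit, P_fit = fit_processivity(x_values, synthetic_amplitudes)
print(f"P = {P_fit:.4f}, mean distance = {mean_distance_bp(P_fit):.0f} bp")
print(f"kd = {dissociation_rate(P_fit, ku=15.5):.2f} per second")
```

With the values quoted in the text (P = 0.9911, ku = 15.5 bp/s), these relations reproduce the reported distance of about 112 bp and the dissociation rate of about 0.14 s⁻¹.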


9.5.3   Active Versus Passive Mechanism of DNA Unwinding

Helicases can unwind dsDNA by an active or passive mechanism. A passive helicase
located at the fork junction does not destabilize the junction base pairs and can
advance only when the junction base pairs are separated by thermal fluctuation. On
the  other  hand,  if  the  presence  of  the  helicase  or  the  forward  motion  of  the  helicase 
powered  by  the  NTP  hydrolysis  reaction  destabilizes  the  junction  base  pairs  and 
shifts  the  equilibrium  of  base  pair  opening  and  closing  toward  opening,  then  the 
helicase  is  unwinding  by  an  active  mechanism  (Fig.  9.5a). 

One way to assess whether a helicase unwinds by an active or passive mechanism is to
determine the sensitivity of helicase-catalyzed unwinding to base pair stability. A
mathematical treatment by Betterton and Jülicher [23] has shown that a helicase may
show different sensitivities to base pair stability depending on whether it uses an
active or a passive mechanism of unwinding. If the helicase works by a passive
mechanism, mostly by trapping the thermally frayed DNA ends, then its unwinding rate
will be maximally influenced by base pair stability. If the helicase interacts with
the ss/ds junction, or its conformational transitions entail motions that promote base
pair melting, then the unwinding rate will not be dramatically influenced by base pair
stability. The degree of the helicase's active involvement in the unwinding mechanism
depends on the extent to which the helicase shifts the equilibrium toward base pair
opening, which in turn depends on the nature of the helicase's interaction with the
fork junction and the coupling of NTP binding and hydrolysis to ssDNA translocation.

Substrates with different stabilities can be easily prepared by introducing different
percentages of GC base pairs in the dsDNA. In the example shown here, we introduced
0-100 % GC content by uniformly distributing the GC base pairs in a 40-bp duplex DNA
region. The average bp stability of the DNA substrates was calculated using nearest
neighbor analysis [43], and it ranges from 0.96 to 2.13 kcal/mol/bp. The gel-based
unwinding traces show an increase in lag time with increasing GC content, indicating
that DNAs with a greater number of GC base pairs are unwound more slowly. The
unwinding kinetics were fit to the uniform stepping model to determine the average
single base pair unwinding rate (ku), which decreases from 30 to 3 bp/s as the GC
content increases (Fig. 9.5b).

To quantitatively fit the above data and determine whether T7 helicase unwinds by a
passive or active mechanism, we used the basic ideas developed by Betterton and
Jülicher [23] for a motor moving against a mobile obstacle. The opening and closing of
the junction base pair is treated as a rapid equilibrium process and described by the
free energy, ΔG, of base pair opening-closing. We assume that the helicase is located
on the fork DNA next to the junction base pair and that the interaction energy of the
helicase with the junction base pair is U0. We also assume that the helicase
translocates with a step size of one nucleotide, in which case the unwinding rate (ku)
of the helicase relative to its ssDNA translocation rate (kss) is described by the
following equation:

[Fig. 9.5 panel labels: Passive Mechanism, Active Mechanism; axis labels: Time (sec),
Base pair stability (ΔG/bp)]

Fig. 9.5 Mechanism of unwinding. (a) Illustration of active versus passive mechanism
of unwinding: in the passive mechanism of unwinding, base pair separation occurs by
thermal fluctuation of junction base pairs, and when the ssDNA available is greater
than or equal to the helicase step size, the helicase may move forward. In the active
mechanism, the presence of the helicase at the junction or its NTP hydrolysis-coupled
forward motion destabilizes the junction base pairs (represented by the blue cloud)
and facilitates DNA melting. (b) DNA stability-dependent unwinding: time course of
unwinding obtained by the gel-based assay for DNA substrates of different stabilities
generated by varying the GC content of the DNA. The reactions were carried out at
18 °C by 100 nM T7 helicase on 2.5 nM DNA in the presence of 2 mM dTTP, 2 mM free
Mg2+, and excess ssDNA trap. The kinetic traces show a decrease in amplitude and rate
of unwinding with increasing GC content. (c) Interaction energy of the helicase with
junction base pairs: a plot of single bp unwinding rates obtained for DNA substrates
with different GC content by the gel-based assay as a function of the average single
bp stability, fit to (9.2) to determine the interaction energy exerted by the helicase
on the junction base pair (U0), keeping f = 0.05. The equation assumes that the
helicase is at the junction and translocates with a step size of 1 bp. The presence of
SSB increases the interaction energy U0, which could be due to the effect exerted by
the SSB directly on the DNA or indirectly through the helicase


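$$\frac{k_u}{k_{ss}} = \frac{c\left[c + (1-c)\,e^{-f U_0/RT}\right]}{c + (1-c)\,e^{-U_0/RT}}, \qquad c = e^{-\Delta G/RT} \tag{9.2}$$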
where R is the gas constant, T is the temperature in kelvin, and f is a dimensionless
coefficient whose value lies between 0 and 1. It describes the relative effect of the
interaction energy on the helicase's forward versus backward rate or on the junction's
opening versus closing rate. The exact value of f is unknown, and it was fixed to a
small value of 0.05 as treated by Betterton and Jülicher [23]. The ssDNA translocation
rate (kss) of 130 nt/s was determined from independent experiments under the same
solution conditions and dTTP concentrations as the unwinding reactions.
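
The dependence of the unwinding rate on base pair stability can be explored with a short Python sketch. It assumes that the equation takes the one-step-potential form of the Betterton and Jülicher model, ku/kss = c[c + (1 - c)e^(-f·U0/RT)] / [c + (1 - c)e^(-U0/RT)] with c = e^(-ΔG/RT). This form is a reconstruction consistent with the surviving terms and symbol definitions and should be checked against the original publication. The function name is hypothetical. The temperature (18 °C), f = 0.05, kss = 130 nt/s, and the ΔG range are taken from the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def relative_unwinding_rate(delta_g, u0, f=0.05, temp_k=291.15):
    """ku/kss for base pair stability delta_g and interaction energy u0 (kcal/mol).

    Assumed form: c * (c + (1 - c) * exp(-f*u0/RT)) / (c + (1 - c) * exp(-u0/RT)),
    with c = exp(-delta_g/RT). For u0 = 0 this reduces to the passive limit c.
    """
    rt = R_KCAL * temp_k
    c = math.exp(-delta_g / rt)
    numerator = c * (c + (1.0 - c) * math.exp(-f * u0 / rt))
    denominator = c + (1.0 - c) * math.exp(-u0 / rt)
    return numerator / denominator


KSS = 130.0  # ssDNA translocation rate from the text, nt/s
for delta_g in (0.96, 1.5, 2.13):  # kcal/mol/bp, spanning the range in the text
    passive = relative_unwinding_rate(delta_g, u0=0.0)
    active = relative_unwinding_rate(delta_g, u0=0.36)
    print(f"dG = {delta_g}: passive {passive * KSS:.1f} bp/s, U0 = 0.36 gives {active * KSS:.1f} bp/s")
```

In this form a purely passive helicase (U0 = 0) unwinds at kss × e^(-ΔG/RT), so its rate falls steeply with GC content, while a positive U0 raises the rate at every ΔG and flattens the dependence.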

The plot of ku/kss for T7 helicase unwinding fork DNAs of different GC content
decreases with increasing average ΔG of base pairing (Fig. 9.5c). The fit to (9.2)
provides an interaction energy U0 of 0.36 kcal/mol, indicating that T7 helicase
unwinds by an active mechanism. However, the small value of U0 indicates that it is
not an optimally active helicase, as was also observed by single molecule methods
[44]. We expect that an optimally active helicase would unwind dsDNA as fast as it
translocates on ssDNA, and such a helicase would have an interaction energy close to
2.2 kcal/mol. Similar conclusions have been made for other ring-shaped helicases such
as E. coli DnaB and bacteriophage T4 helicase [45, 46]. It has been observed that the
efficiency of a helicase greatly increases in the presence of other proteins such as
the single strand binding (SSB) protein, DNA polymerase, and primase enzymes. Below we
describe examples that show the effect of these proteins on helicase activity and
interpretations of how they modulate the mechanism of helicase action.


9.6    Effect  of  the  SSB  on  DNA  Unwinding 

SSB  proteins  have  been  shown  to  stimulate  DNA  unwinding  by  many  helicases 
[47].  The  SSB  protein  binds  to  ssDNA  intermediates  and  mediates  steps  in  DNA 
replication  and  repair,  although  the  exact  mechanism  is  not  well  understood.  E.  coli 
SSB  is  a  heterologous  ssDNA  binding  protein  for  T7  helicase,  but  it  is  present  in  the 
cell during infection by phage T7 and it is a well characterized ssDNA binding protein
[48]. To understand the effect of E. coli SSB, the unwinding parameters from
the  gel-based  and  real  time  unwinding  assays  by  T7  helicase  in  the  presence  of  SSB 
are  compared  to  those  obtained  in  its  absence. 

In the presence of E. coli SSB, the average unwinding rate per base pair is
approximately twice the average unwinding rate in its absence using the same DNA
substrates. The unwinding traces for the 18-90 bp DNA (32 % GC content) obtained from
the gel-based assays, fit to the uniform stepping model as described above, provided
globally constrained ks of 24 steps/s, s of 1.53 bp, and Lm of 10.4 bp in the presence
of SSB, as compared to 18.6 steps/s, 0.8 bp, and 14.8 bp in the absence of SSB. Thus,
the rate of unwinding by T7 helicase increases in the presence of SSB from 15.5 to
36.6 bp/s. A sample unwinding curve showing the effect of SSB on a 60 % GC DNA is
shown in Fig. 9.6a, and the parameters are summarized in Table 9.1.
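
As a quick arithmetic check, the average single base pair unwinding rate is the product of the stepping rate and the step size. The Python sketch below applies this to the rounded fit parameters quoted above. The variable and function names are hypothetical, and the small differences from the reported rates are assumed here to come from rounding of the fitted parameters.

```python
def unwinding_rate(ks, s):
    """Average single base pair unwinding rate ku = ks * s (bp/s)."""
    return ks * s


# (ks in steps/s, s in bp) as quoted in the text for the global fits.
fit_params = {"-SSB": (18.6, 0.8), "+SSB": (24.0, 1.53)}
rates = {cond: unwinding_rate(ks, s) for cond, (ks, s) in fit_params.items()}

# Rates as reported in the text (bp/s).
reported = {"-SSB": 15.5, "+SSB": 36.6}
fold = reported["+SSB"] / reported["-SSB"]

for cond in fit_params:
    print(f"{cond}: ks*s = {rates[cond]:.1f} bp/s, reported {reported[cond]} bp/s")
print(f"fold stimulation by SSB (reported rates) = {fold:.2f}")
```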

The processivity of the helicase also increases in the presence of SSB (0.9979),
indicating that T7 helicase unwinds on average about 476 bp of dsDNA before
dissociating (Fig. 9.4d), compared to 112 bp in the absence of SSB, and the average
rate of helicase dissociation decreases to 0.077 s⁻¹. The interaction energy of the
helicase with the junction base pair, obtained by plotting ku/kss versus ΔG/bp of the
DNA, increased about twofold in the presence of SSB (Fig. 9.5c). This increase in U0
could be due to a direct effect of the SSB on the ss/ds junction or an indirect effect
through the helicase.

Fig. 9.6 Effect of various replisome proteins on DNA unwinding by T7 helicase. (a)
Effect of E. coli SSB on DNA unwinding: time course of unwinding a 60 % GC DNA shows
an increase in rate and amplitude in the presence of SSB (1 µM). DNA unwinding was
measured by the gel-based assay at 18 °C by 100 nM gp4A' on 2.5 nM DNA in the presence
of 2 mM dTTP and 2 mM free Mg2+. (b) Effect of the priming site on DNA unwinding:
comparison of the time course of gel-based unwinding reactions shows that the presence
of the T7 priming site slows down T7 helicase both in the presence


9.7    Effect  of  the  Primase  on  DNA  Unwinding 

The primase provides the RNA primers required to initiate DNA replication on the
lagging strand. The primase protein is closely associated with the replicative
helicase and, in the case of bacteriophage T7, the primase is part of the same
polypeptide. The N-terminal half of T7 helicase contains the primase domain, which
recognizes the 3'CTGG/TG/T sequence on ssDNA [49] and uses this priming sequence as a
template to make short RNA primers when ATP + CTP are present. In the hexameric
structure, the primase domains are positioned behind the helicase domains, and the
distance between the helicase and primase active sites is about 10 nt [50]. The
primase synthesizes RNA primers in the opposite direction (3'-5') relative to the
motion of the helicase domain (5'-3'). This raises the question of how the primase
affects the unwinding activity of the helicase.

To investigate whether the DNA unwinding activity of T7 helicase is influenced by the
priming activity, we made two fork substrates: one containing a single priming
sequence (3'CTGGG) 22 bp downstream from the fork junction, and a control DNA without
the priming sequence. DNA unwinding by T7 helicase was measured using the all-or-none
gel-based assay under single round conditions. The lag kinetics were computationally
fit to the uniform stepping model, which shows that the priming substrate is unwound
with an average single base pair unwinding rate of 15 bp/s, whereas the control
substrate is unwound with a rate of 47 bp/s (Fig. 9.6b). The experiments show that the
priming activity slows the helicase [51]. We observed helicase slowing even in the
absence of ATP + CTP, which indicates that primer synthesis is not required and that
simply binding of the primase domains to the priming site is sufficient to slow the
helicase. Comparable results were also obtained with the real time fluorescence assay
performed at varying concentrations of dTTP (the energy source of the helicase). Thus,
the effect of primase on helicase slowing is an allosteric effect in which the primase
engaged with the priming site opposes the helicase's forward movements. Interestingly,
helicase slowing in the presence of the priming site was not observed when T7 helicase
was coupled with T7 DNA polymerase [51].

Fig. 9.6 (continued) and absence of priming NTPs. (c) Effect of T7 DNA polymerase on
DNA unwinding: time course of unwinding in the presence and absence of T7 DNA
polymerase at increasing dNTP concentrations, obtained by the gel-based assay, shows
that the presence of an active DNA polymerase stimulates DNA unwinding and that at
saturating dNTP concentrations the unwinding rates approach the helicase translocation
rate on ssDNA. The reactions were carried out at 18 °C with 400 nM T7 helicase, 400 nM
T7 DNA polymerase (T7 gp5:thioredoxin complex), 200 nM fork DNA with primer, 2 mM
dTTP, and varying concentrations of dNTP (5, 10, 40, and 100 µM)

Table 9.1 Effect of replisomal proteins on the average single base pair unwinding rate
and processivity of T7 gp4A'^a

                                   Effect of SSB    Effect of primase     Effect of polymerase
                                   -SSB    +SSB     -Primase  +Primase    -Polymerase  +Polymerase
GC content (%)                     32      32       20        20          33           33
Average single bp
  unwinding rate^b (bp/s)          ~15     ~37      ~47       ~15         ~9           ~114
Processivity^c (bp)                112     476      -         -           -            -
Interaction energy (U0)^d          0.35    0.74     -         -           -            -

^a The differences observed across experiments with only helicase are an effect of
GC% and other experimental conditions
^b Average single bp unwinding rate = stepping rate × step size
^c Processivity: number of bp unwound per binding event
^d Interaction energy: measure of the influence of the helicase on the junction base pair

In contrast to the slowing of T7 helicase in the presence of priming sites, single
molecule experiments using magnetic tweezers in the T4 system, in which the primase
enzyme (gp61) is a separate polypeptide, have shown that the presence of gp61 in the
absence of NTPs does not modify the helicase unwinding activity on a DNA containing
priming sites [38]. Continued replication fork progression during primer synthesis
appears to occur through the formation of a priming loop (T7 and T4) [38, 51],
dissociation of one of the primase subunits from the primosome complex (E. coli) [52],
or pausing of leading strand synthesis (T7) [53].


9.8    Effect  of  the  DNA  Polymerase  on  DNA  Unwinding 


Leading strand DNA synthesis is catalyzed by the synergistic activity of T7 helicase
and T7 DNA polymerase. To investigate how T7 DNA polymerase affects the unwinding rate
of T7 helicase, we prepared a fork DNA that contained a primer-template in place of
the 3'-overhang, where the DNA polymerase could bind. The unwinding kinetics were
measured using the all-or-none gel-based assay. The helicase and polymerase were
preassembled on the replication fork DNA in the presence of dTTP, and reactions were
initiated with Mg2+ and the remaining dNTPs. The lag kinetics were computationally fit
to the stepping model, which shows that the isolated helicase unwinds the replication
fork DNA with an average single base pair unwinding rate of 9 bp/s (30 % GC content),
but in the presence of T7 DNA polymerase with saturating dNTPs this rate increased to
~120 bp/s, close to the ssDNA translocation rate (Fig. 9.6c). The rate stimulation
depended on the concentration of dNTPs, which controls the rate of DNA synthesis. As
the concentration of the 3 dNTPs was increased through 5, 10, 40, and 100 µM, the
average single base pair unwinding rate increased to 45, 65, 116, and 117 bp/s,
respectively. This shows that when the DNA polymerase catalyzes DNA synthesis at its
fastest rate, at saturating dNTPs, it is able to stimulate the helicase activity to a
maximum that resembles the helicase's speed of ssDNA translocation (130 nt/s). A
heterologous polymerase such as T4 DNA polymerase does not cause similarly high
stimulation [54]. Thus, both physical and functional couplings are necessary for
effective leading strand DNA synthesis by the helicase-polymerase.
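
The extent of stimulation at each dNTP concentration can be expressed relative to the isolated helicase and to the ssDNA translocation rate. The short Python sketch below uses only the rates quoted in this section. The names are hypothetical, and no kinetic model (for example, a hyperbolic dependence on dNTP concentration) is assumed or fitted.

```python
KSS_NT_PER_S = 130.0       # ssDNA translocation rate from the text
HELICASE_ALONE_BP_S = 9.0  # isolated helicase on the replication fork DNA

# dNTP concentration (µM) -> average single bp unwinding rate (bp/s), from the text.
rates_by_dntp_um = {5: 45.0, 10: 65.0, 40: 116.0, 100: 117.0}


def stimulation(rate_bp_s):
    """Return (fold increase over helicase alone, fraction of ssDNA translocation rate)."""
    return rate_bp_s / HELICASE_ALONE_BP_S, rate_bp_s / KSS_NT_PER_S


for conc, rate in rates_by_dntp_um.items():
    fold, fraction = stimulation(rate)
    print(f"{conc:>3} µM dNTP: {fold:.1f}-fold over helicase alone, {fraction:.0%} of kss")
```

The rates level off between 40 and 100 µM dNTP at roughly 90 % of the ssDNA translocation rate, which is the basis for describing the coupled helicase as approaching its translocation speed.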

The stimulation of leading strand unwinding and synthesis could result from the DNA
polymerase providing a "push" for the forward motion of the helicase or acting as a
"brake" to prevent slippage, as suggested for the T7 system [54]. Recent single
molecule studies on the T4 system indicate that the stimulation could arise because
the helicase allows the polymerase to adopt a polymerizing conformation rather than an
exonucleolytic conformation, while the polymerase aids the helicase by unwinding the
first few base pairs, thereby increasing the unwinding rate [55].


9.9  Conclusions 

Obtaining accurate kinetic parameters that define the unwinding activity of helicases
is critical to understanding both the mechanism and the role of helicases in
biological processes. The ensemble approaches, especially the pre-steady state kinetic
methods, provide a facile way to obtain basic helicase parameters that can be used as
handles to understand the regulation of helicases by associated proteins, such as DNA
polymerase and single strand binding protein, in DNA replication.

The ensemble methods measure the averaged behavior of helicase molecules, which is an
inherent limitation that precludes precise measurements of unsynchronized or
heterogeneous populations. This limitation can be overcome by single molecule methods,
which track individual helicase molecules and can resolve the behaviors of helicase
populations. The single molecule methods are particularly powerful in observing the
stepwise translocation of processive helicases. Such studies have revealed previously
unobservable stalling/pausing behavior of individual helicase molecules during DNA
unwinding [56].

All three types of single molecule methods, optical trap, magnetic tweezers, and
fluorescence resonance energy transfer (FRET), have been used to measure the unwinding
reaction by ring-shaped replicative helicases including T7 gp4A', bacteriophage T4
gp41, and E. coli DnaB. The optical trap and magnetic tweezers methods apply
piconewton forces and measure nanometer distance changes during DNA unwinding. All of
these helicases show faster unwinding rates with application of increasing
destabilizing force on the fork DNA, and the force-velocity measurements are in
agreement with the ensemble studies and indicate that these ring-shaped helicases do
not function by an entirely active mechanism [44, 45, 55]. While the single molecule
methods provide intricate details of helicase translocation, the stretching force on
the DNA itself may affect the helicase mechanism. Experiments with DnaB helicase have
shown that the geometry of the fork DNA, and whether the force is applied to the
strand occluded by the DnaB ring or to the strand encircled by DnaB, dictate the
degree of activeness [46]. Similarly, separation of the two strands behind the
translocating helicase under force could also modify the helicase mechanism if the
excluded strand is involved in some aspect of unwinding, as has been proposed for the
eukaryotic MCM 2-7 protein [57].

Single molecule FRET provides an alternative way to measure nanometer distance changes
during DNA unwinding without application of force. These experiments have been used to
understand aspects of helicase unwinding such as enzyme conformational changes and
interactions with other proteins in the replisome [5]. Such experiments showed slowing
of T7 helicase by a priming site on fork DNA and the formation of a priming loop when
T7 helicase is coupled to the DNA polymerase during leading strand DNA replication
[51]. The application of methods that combine FRET and force-based manipulation
[58, 59] will be helpful in tracking the relative positioning of proteins on the
replication fork while simultaneously monitoring the unwinding and replication process
through force and distance measurements.


Chapter 10

Rotary  Motor  ATPases 


Stephan  Wilkens 


Abstract The preceding chapters offered an introduction to a selection of the biophysical
tools most commonly used to elucidate the structure and mechanism of
biological macromolecules. This chapter describes how some of these
tools were applied, and new ones developed, to gain an understanding of the catalytic
mechanism of a particular class of membrane-bound transport proteins, the
rotary motor ATPases. Rotary motor ATPases are highly efficient molecular
machines that interconvert chemical energy (in the form of ATP) and the
potential energy of a transmembrane ion motive force. Energy conversion
in the rotary ATPases involves rotation of a central subdomain (the rotor) relative
to a static portion called the stator. Most rotary motor ATPases can function in
both directions, which means that the enzyme can either pump ions across lipid
membranes at the expense of ATP hydrolysis or synthesize ATP driven by ion flow
along a concentration gradient through the membrane-bound part of the complex.
Catalysis involving subunit rotation was proposed before detailed structural information
was available; however, proving the existence of a rotary mechanism turned
out to be a biophysical challenge, and novel single molecule observation
techniques had to be developed along the way to confirm the rotary motor hypothesis.
Rotary motor ATPases are found in all domains of life: bacteria,
archaea, and eukarya. The enzyme found in the inner membrane of mitochondria,
the plasma membrane of bacteria, and the thylakoid membrane of chloroplasts is
called F1Fo-ATP synthase or F-ATPase. The enzyme found in archaea is called
A-ATP synthase (or A1Ao-ATP synthase or A-ATPase), and the enzyme found in the
endomembrane system (and sometimes the plasma membrane) of eukaryotic organisms
is called vacuolar ATPase (or V1Vo-ATPase or V-ATPase). The family of rotary
ATPases is characterized by a similar overall topology, a cytoplasmic ATPase
connected to a membrane-bound ion channel, with differences in subunit composition
and structure that have evolved to accommodate different functional needs and
mechanisms of enzyme regulation.

Keywords ATPase • Rotary molecular motor • F-ATP synthase • A-ATP synthase
•  Vacuolar  ATPase  •  Biophysics  •  X-ray  crystallography  •  Electron  microscopy  • 
NMR  spectroscopy  •  Small  angle  X-ray  scattering  •  Single  molecule  observation  • 
Single  molecule  fluorescence  resonance  energy  transfer  (FRET)  spectroscopy 


10.1    The  Family  of  Rotary  Motor  ATPases 

The family of rotary motor ATPases is divided into three main subtypes that are
believed to have originated from a common ancestor in parallel with the divergence of
the three kingdoms of life [1-4]. The enzymes found in the plasma membrane of
archaea and bacteria are called A-ATPase and F-ATPase, respectively. Eukaryotic
organisms harbor both an F-type ATPase (in the inner mitochondrial membrane and
the thylakoid membrane of chloroplasts) and a complex structurally more similar to
the A-ATPase that is called vacuolar ATPase (V-ATPase). The vacuolar ATPase is
found in the endomembrane system of all eukaryotic cells but also in the plasma
membrane of some specialized cell types in higher animals [5-8]. The presence of
both F- and V-type ATPases in eukaryotic cells is consistent with the idea that modern
eukaryotes evolved from proto-eukaryotes (an early branch of the archaea)
that formed a symbiotic relationship with a proteobacterium, the precursor of
modern mitochondria [9].

A rotary motor ATPase structurally similar to the A-ATPase has also been found
in some, mostly extremophilic, bacteria, and it is believed that the genes for the
archaeal-like enzyme in these organisms were acquired by horizontal gene
transfer from archaea sharing the same habitat [10]. This A- or A/V-like ATPase
likely functions as an ATP synthase, either replacing or duplicating the function of the
F-ATPase in these bacteria [11, 12]. From here on in the chapter, vacuolar ATPase
(V-ATPase) will be used in the context of the eukaryotic enzyme, whereas the bacterial
complex will be referred to as A/V-like ATPase as suggested in [13]. One of the
structurally and functionally best characterized members of the bacterial A/V-like
ATPases is the enzyme from the thermophilic bacterium Thermus thermophilus
[14]. Thanks to its thermostability, intact T. thermophilus A/V-ATPase and the
enzyme's functional domains and individual subunits have been purified and their
structure and stoichiometry analyzed by electron microscopy, X-ray crystallography,
NMR spectroscopy, and native mass spectrometry (MS) (see below). A schematic
of the bacterial F-, the archaeal A-, and the eukaryotic V-ATPase is shown in
Fig. 10.1. The first rotary ATPase described was F-ATPase, also referred to as F1Fo-ATP
synthase, with F1 representing "Factor 1," the water-soluble ATP hydrolyzing
factor that can be released from, e.g., the inner mitochondrial membrane, and Fo
the insoluble, membrane-bound Factor "o" (with the subscript "o" referring to the
macrolide oligomycin, a potent inhibitor of mitochondrial F-ATP synthase) [15,
16]. For consistency, archaeal A- and eukaryotic V-ATPase are therefore also
referred to as A1Ao-ATP synthase and V1Vo-ATPase, respectively, even though
oligomycin is not an inhibitor of these two enzymes.

Fig. 10.1 Structural models of bacterial F-, archaeal A-, and eukaryotic V-ATPase. Models of the
subunit arrangement in the bacterial F-ATPase (a), archaeal A-ATPase (b), and eukaryotic
V-ATPase (c). While F- and A-ATPase can function as both ATP synthases and ATP hydrolysis-driven
ion pumps, eukaryotic V-ATPase is a dedicated proton pump. F-ATPase has one stator stalk,
while A-ATPase and V-ATPase have two and three, respectively. Catalytic subunits
are in blue, the rotor in green, and the stator stalk(s) or stator(s) in red/orange

The A-, V-, and F-ATPases function as the smallest rotary molecular motors
described so far. Depending on the enzyme source, the speed of subunit rotation can
exceed 100 revolutions per second (rps), with synthesis or hydrolysis of ATP catalyzed
at rates of up to 500 s^-1 (for the chloroplast ATP synthase [17]). Unequivocal proof
that multi-site catalysis is accompanied by subunit rotation, first postulated for the
F-ATPase from animal mitochondria [18], required the development of sophisticated
single molecule observation techniques, which have since provided a wealth of
information on the molecular mechanism of subunit rotation and the movement of
domains of catalytic and regulatory subunits during steady state catalysis. The
functional data, together with progress in atomic and near-atomic resolution structure
determination by X-ray crystallography, NMR spectroscopy, and cryo electron
microscopy, make the rotary ATPases one of the best characterized families of
molecular machines.
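
As a rough consistency check on the numbers quoted above, rotation speed and catalytic turnover can be related to each other. The sketch below assumes that one full revolution of the rotor corresponds to three ATP molecules, in line with the three catalytic sites of the ATPase sector described later in this chapter; the constant and function names are illustrative only.

```python
# Assumption: three catalytic sites, so one rotor revolution = 3 ATP
# synthesized or hydrolyzed (perfect coupling, no slippage).
ATP_PER_REVOLUTION = 3


def atp_rate_from_rotation(rps: float) -> float:
    """ATP turnover (per second) for a rotor turning at `rps` revolutions/s."""
    return rps * ATP_PER_REVOLUTION


def rotation_from_atp_rate(atp_per_s: float) -> float:
    """Rotor speed (revolutions/s) implied by a given ATP turnover rate."""
    return atp_per_s / ATP_PER_REVOLUTION


# 100 rps corresponds to 300 ATP per second under this assumption.
print(atp_rate_from_rotation(100))
# A turnover of 500 per second implies a rotor speed of roughly 167 rps.
print(round(rotation_from_atp_rate(500), 1))
```

Under this simple assumption, the quoted turnover of 500 s^-1 and rotation speeds above 100 rps are mutually consistent.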


10.2    Function  of  the  Rotary  Motor  ATPases 

Rotary molecular motor ATPases are energy converters that couple the free energy
change associated with the synthesis or hydrolysis of MgATP to the release or storage
of potential energy in the form of a transmembrane electrochemical potential.
According to Mitchell's chemiosmotic hypothesis ([19]; for an in-depth treatment,
see [20]), the potential energy (difference in free energy, ΔG) of the electrochemical
ion gradient can be described as "ion motive force" (imf), which is given by:

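$$\mathrm{imf\ (mV)} = -\Delta\tilde{\mu}_{\mathrm{H}^+} \cdot F^{-1} = \Delta\psi - (2.3 \cdot RT) \cdot F^{-1} \cdot \Delta\mathrm{pH} = \Delta\psi - 59 \cdot \Delta\mathrm{pH}$$

with $\Delta\tilde{\mu}_{\mathrm{H}^+}$ being the difference in electrochemical potential, $F$ the Faraday
constant, $\Delta\psi$ the transmembrane electrical potential difference, and $\Delta\mathrm{pH}$ (or $\Delta\mathrm{pNa}$
in the case of sodium ion-coupled ATPases) the difference in ion concentration.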
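
The expression for the ion motive force can be evaluated numerically. The following is a minimal sketch, not a definitive implementation: it computes the factor 2.3·RT/F from the gas and Faraday constants (using ln 10 rather than the rounded 2.3) and then applies imf = Δψ − (2.3·RT/F)·ΔpH. The function names, the default temperature of 25 °C, and the example input values are assumptions chosen for illustration; the sign convention for Δψ and ΔpH must match the one used in the equation.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_slope_mV(temperature_K: float = 298.15) -> float:
    """Return 2.3*RT/F in mV per pH unit (about 59 mV at 25 degrees C)."""
    return 1000.0 * math.log(10) * R * temperature_K / F


def ion_motive_force_mV(delta_psi_mV: float, delta_pH: float,
                        temperature_K: float = 298.15) -> float:
    """imf = delta_psi - (2.3*RT/F) * delta_pH, all voltages in mV."""
    return delta_psi_mV - nernst_slope_mV(temperature_K) * delta_pH


# Hypothetical example values, chosen only to illustrate the arithmetic.
print(round(nernst_slope_mV(), 2))                   # slope at 25 degrees C
print(round(ion_motive_force_mV(150.0, -0.5), 1))    # imf for these inputs
```

Because the slope scales linearly with absolute temperature, the value of 59 mV per pH unit applies only near 25 °C.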

One of the structurally and functionally best characterized rotary ATPases is the
F-ATPase (F1Fo-ATP synthase) found in the plasma membrane of bacteria, the mitochondrial
inner membrane, and the thylakoid membrane of chloroplasts [18, 21-24].
F-ATPase can function as a proton (or sodium) gradient-driven ATP synthase,
harnessing the potential energy of the ion motive force across, e.g., the bacterial
plasma or inner mitochondrial membrane to synthesize ATP from ADP and inorganic
phosphate, or the enzyme can hydrolyze ATP to pump protons or sodium ions
across a lipid bilayer. When growing under anaerobic conditions, certain bacteria
(such as Escherichia coli) use F-ATPase in the reverse direction, that is, as an ATP
hydrolysis-driven proton pump, to establish a proton motive force across the plasma
membrane that drives secondary transport processes for the
import of nutrients and the export of metabolites.

ATP synthesis by mitochondrial F-ATPase, which provides the bulk of chemical
energy in all animals, is driven by the proton gradient established by the respiratory
complexes I, III, and IV during electron transport from catabolically generated
reducing equivalents to molecular oxygen [25]. ATP synthesis by the chloroplast
enzyme in green plants is driven by a proton gradient established by the photosystems
as a result of the sunlight-driven oxidation of water, which yields reducing equivalents
(NADPH) and molecular oxygen.

Depending on the type of membrane, the individual contributions of membrane
potential and pH difference can vary. Most of the driving force in animal mitochondria
is contributed by an electrical potential difference, with only a minor difference
in pH between inter-membrane space and matrix (~1 pH unit), allowing proteins to
function on both sides of the inner membrane. In the thylakoid membrane of chloroplasts,
however, the free energy difference is dominated by a proton gradient that
can reach several pH units during light-driven electron transport (~4 pH units).
Accordingly, only a few functional proteins are found in the thylakoid lumen, and these
have adapted to function even under relatively acidic conditions (~pH 4).
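
To put the pH differences quoted above on the same scale as a membrane potential, each pH unit can be converted to millivolts with the factor of about 59 mV per pH unit from the imf expression. This is a minimal sketch that assumes a temperature near 25 °C; the constant, function, and dictionary names are illustrative.

```python
# About 2.3*RT/F at ~25 degrees C, the factor used in the imf expression.
MV_PER_PH_UNIT = 59.0


def ph_term_mV(delta_pH_units: float) -> float:
    """Magnitude of the pH contribution to the ion motive force, in mV."""
    return MV_PER_PH_UNIT * delta_pH_units


# Approximate pH differences as quoted in the text.
approx_delta_pH = {
    "animal mitochondria": 1.0,
    "chloroplast thylakoid": 4.0,
}

for membrane, delta_pH in approx_delta_pH.items():
    print(f"{membrane}: ~{ph_term_mV(delta_pH):.0f} mV from the pH difference")
```

A difference of about 4 pH units therefore corresponds to well over 200 mV, which illustrates why the thylakoid membrane can rely mainly on the pH term.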

Most of what we know about the A- and A/V-like ATPases has been derived from
studies with the bacterial enzymes from, e.g., T. thermophilus [14] and Enterococcus
hirae [26]. Due to their similarity in subunit composition and overall architecture, it
is generally assumed that both rotary ATPase subtypes share a similar enzymatic
mechanism, and recent progress with purification and structural characterization of
the archaeal enzyme, despite the difficulties with growing the often extremophilic
organisms in the laboratory, appears to confirm this assumption [11, 12]. The
catalytic mechanism of the T. thermophilus A/V-ATPase (both for the intact complex
and the soluble ATPase sector) has been studied extensively using the single
molecule observation methods developed for characterizing rotational catalysis of
the F-ATPase [27] (see below).

While F- and A-ATPase can function with high efficiency in both ATP synthesis
and ATP hydrolysis directions, the eukaryotic vacuolar ATPase (V-ATPase) is a
dedicated proton pump, hydrolyzing MgATP to acidify the lumen of subcellular
organelles or the extracellular space in some specialized or polarized cells such as
renal intercalated cells or bone osteoclasts [5-8]. This difference in physiological
function between F- and A-ATPase on the one hand and V-ATPase on the other is
not a fundamental difference in enzymatic mechanism, as eukaryotic V-ATPase can
be made to reverse direction and synthesize ATP, albeit at low efficiency [28]. Over
the course of evolution, V-ATPase has acquired additional subunits that have no
counterparts in the A- or F-ATPase. In addition, the proton pumping activity of the
eukaryotic V-ATPase has been shown to be regulated by a unique mechanism
referred to as reversible dissociation. In yeast, for example, under conditions of
glucose deprivation, the V1-ATPase disengages from the membrane integral Vo proton
turbine with concomitant silencing of ATPase and proton translocation activities
and, at the same time, one of the V-ATPase subunits, C, is released from both
domains of the complex [29]. V-ATPase dissociation is reversible; however, reassembly
of intact V-ATPase from cytoplasmic V1, membrane-bound Vo, and subunit
C requires the catalytic action of a chaperone complex called RAVE (regulator of
H+-ATPases of vacuolar and endosomal membranes) [30]. RAVE has been shown to
bind subunits of the V1, and the chaperone likely acts as a scaffold that brings
all components of the enzyme back together in a spatially ordered fashion. Nutrient
availability-dependent reversible dissociation or developmentally regulated assembly
of V-ATPase is also found in insects and higher animals [31-33], suggesting that
this unique mode of activity regulation has been conserved throughout evolution.
Most importantly, eukaryotic V-ATPase has been shown to be involved in or even
responsible for a number of widespread human diseases such as osteoporosis [34],
renal tubular acidosis [35, 36], diabetes [37], sensorineural deafness [38], and cancer
[39]. Due to its established role in human health, the structure of the V-ATPase and
the mechanism of its activity regulation by reversible disassembly are actively studied
by many groups, and efforts are underway to identify small molecule compounds
that may be used to modulate the proton pumping activity of the enzyme [40, 41].


10.3    Structure  of  the  Rotary  Motor  ATPases 

Rotary motor ATPases are large, multisubunit enzyme complexes composed of two
major functional units: a membrane peripheral, water-soluble ATPase that is characterized
by an alternating hexameric arrangement of three catalytic and three non-catalytic
nucleotide binding subunits, and a membrane-embedded ion channel made
of a ring of hydrophobic "proteolipid" subunits with a transmembrane subunit
located at the outer periphery of the ring (see Fig. 10.1 and Table 10.1). The cytoplasmic
ATPase (referred to as F1, A1, or V1) is connected to the membrane sector
Table 10.1 Subunit composition of F-, A-, and V-ATPases

| Sector | F-ATPase, bacterial^a | F-ATPase, mitochondrial^b | A-ATPase | V-ATPase^d | Function |
|---|---|---|---|---|---|
| ATPase (F1, A1, V1) | α (3) | α (3) | B (3) | B (3) | Non-catalytic |
| | β (3) | β (3) | A (3) | A (3) | Catalytic |
| | γ (1) | γ (1) | D (1) | D (1) | Rotor |
| | δ (1) | OSCP (1) | | | Stator stalk |
| | ε (1) | δ (1) | F (1) | F (1) | Rotor, regulatory |
| | | ε (1) | | | Rotor, regulatory |
| | | d (1) | E (2) | E (3) | Stator stalk |
| | | h (1) | G (2) | G (3) | Stator stalk |
| | | | | C (1) | Stator |
| | | | | H (1) | Regulation |
| Ion channel (Fo, Ao, Vo) | a (1) | a (1) | a (1) | a (1) | Stator, ion channel |
| | b (2) | b (1) | | | Stator stalk |
| | c (8-15) | c (8-10) | c (10)^c | c, c', c'' (10?) | Rotor, ion binding |
| | | | d (1)^c | d (1) | Coupling spacer |
| | | f, e, g, 8, i, k | | | Dimerization |
| | | | | e | Unknown |

a Subunit nomenclature of the bacterial enzyme
b For yeast mitochondrial F-ATPase
c Subunits a, c, and d in the A and A/V-type enzymes are also called I, K, and C, respectively
d For yeast vacuolar ATPase


(Fo, Ao, or Vo) via a rotating "central stalk" that couples conformational changes in
the catalytic subunits to c ring rotation, and a static "stator stalk" that functions to
resist the torque of rotational catalysis and keep the ATPase and ion channel in the
correct spatial arrangement relative to one another for efficient energy coupling to
take place. While X-ray crystallography has provided atomic resolution detail for
the ATPase sectors, proteolipid rings, and stator stalks from a wide variety of rotary
ATPases, only electron microscopic reconstructions are available for the overall
structure of intact enzyme complexes.


10.3.1    Structural  Information  from  X-Ray  Crystallography: 
ATPase  Sectors,  Rotor  Rings,  and  Stator  Stalks 

Mitochondria-rich animal tissues allowed purification of large amounts of F1-ATPase,
and most of the early biochemical and structural studies were conducted
with the enzymes from, e.g., beef heart or rat liver [42, 43]. The first atomic resolution
structure was reported for the beef heart enzyme and, at the time, the structure
represented the largest asymmetric biomacromolecule solved to date (~380 kDa;
see Fig. 10.2a) [44]. A novel solvent flattening protocol developed specifically for
improving the electron density maps was crucial for tracing most of the non-catalytic
α and catalytic β subunits and part of the γ subunit [45]. The most striking aspect
of the beef heart F1 structure was that the three catalytic nucleotide binding sites
were seen in three different conformations. One catalytic site had the non-hydrolyzable
ATP analog AMPPNP bound, another had ADP bound, and the
third was empty (see Fig. 10.2b, d). This configuration of the three catalytic
sites was exactly what had been proposed more than 10 years earlier, based on biochemical
experiments, in what became generally known as the binding change
mechanism of F1-ATPase [18]. The model predicted that at any given time, one catalytic
site has ATP tightly bound, one site has ADP loosely bound, and one site
is empty (see below for details of the binding change mechanism). Subsequent
structures of the beef heart [46-48], rat liver [49], chloroplast [50], yeast [51, 52],
and E. coli enzymes [53] crystallized in the presence of various nucleotide and/or
inhibitor combinations revealed a variety of conformations of the catalytic nucleotide
binding subunits, including one where all three sites were (at least partially)
filled with nucleotide [47]. In the first beef heart structure, only part of the central
rotor subunit, γ, was resolved, while the middle portion of γ as well as the other two
rotor subunits, δ and ε (subunit nomenclature of the mitochondrial enzyme), were
not visible due to disorder. The subunits and subunit domains not seen in the first
structure, however, could be resolved using data collected from crystals that were
slightly more compact, likely due to dehydration [46]. Crystal structures are also
available for the ATPase sectors of the bacterial A/V-type ATPase from T. thermophilus,
albeit at moderate resolutions of between 2.8 and 4 Å [54, 55].

Fig. 10.2 Crystal structure of mitochondrial F-ATPase catalytic and proteolipid domains. (a) Side
view parallel to the membrane of the X-ray crystal structure of bovine heart F1-ATPase (1e79). (b) A
view of the structure from the membrane surface towards the matrix. The catalytic sites are in three
different conformations with an empty (βE), a MgAMPPNP-bound (βTP), and an MgADP-bound
site (βDP). (c) Crystal structure of the rotor ring of yeast F-ATPase (3u2f). The Glu59 sidechain is
shown in spacefill. (d) The three β subunits in the AMPPNP, ADP, and empty conformations

After solving the beef heart F1 structure, efforts intensified to obtain a structure
of the intact F1Fo-ATP synthase. While crystals of the detergent-solubilized yeast
and bovine enzymes could be obtained [52, 56, 57], they only revealed electron
density for the ATPase sector (α3β3γδε) and the ring of proteolipids (c10 for yeast, c8
for bovine). Although the quality of the electron density in the first yeast F1c10 structure
was not sufficient for unambiguous tracing of the proteolipid chains, the data
clearly showed that there were 10 c subunits in the ring [56], a number that had been
speculated about extensively based on biochemical and crosslinking experiments conducted
with the enzyme from E. coli [58, 59].

The first high resolution X-ray crystal structure for an isolated c subunit ring was
obtained from crystals of purified, SDS-resistant proteolipid rings of the bacterial
sodium pumping F-ATPase from Ilyobacter tartaricus [60]. The structure of the I. tartaricus
c11 ring revealed the interaction of the individual c subunits and the residues
involved in sodium binding. Following the I. tartaricus ring structure, a series of other
rings with varying c subunit stoichiometries were reported, including a c10 ring of yeast
F-ATPase [61], a c14 ring from the spinach chloroplast enzyme [62], and a 15-c subunit
ring from the proton transporting F-ATPase from Spirulina platensis [63]. The only ring
seen so far with 12 c subunits is the one from T. thermophilus A/V-type ATPase [64, 65],
though no high resolution structure is currently available. A structure has been obtained
for the 10-c subunit ring of the sodium transporting A/V-like ATPase from E. hirae [66].
However, the E. hirae c subunits are more similar to those of the eukaryotic V-ATPase in that
they contain four transmembrane segments, with the proton or sodium binding carboxyl
residue in the C-terminal membrane spanning α helix. With a total of 40 transmembrane
α helices, the E. hirae ring is the largest observed so far.

Dividing the number of c subunits by three gives the number of ions that need to
cross the membrane sector for the synthesis or hydrolysis of one molecule of ATP,
provided that each ion translocated is coupled to rotation of the central stalk. From the
above c subunit stoichiometries, it follows that the ion to ATP ratio varies from ~2.7
(for bovine mitochondrial F1Fo) to 5 (for the enzyme from S. platensis). Interestingly,
experimental determination of the H+/ATP ratio revealed that four protons have to be
translocated per ATP in both the chloroplast and E. coli ATP synthases [67], suggesting
a possible uncoupling of proton translocation and rotor movement (slippage)
under some conditions. Figure 10.2c shows the recent high resolution crystal structure
of the c10 proteolipid ring from the F-ATPase from S. cerevisiae [61], highlighting the
carboxylate residues (Glu59) located in the middle of the C-terminal helix of the c
subunit, where they are exposed to the middle of the lipid bilayer.
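
The relationship between c ring size and ion to ATP ratio described above is simple arithmetic and can be sketched as follows. The example assumes perfect coupling, that is, one ion per c subunit per revolution and three ATP per revolution; the ring sizes are those quoted in the text, and the function and variable names are illustrative.

```python
from fractions import Fraction

# Assumption: three catalytic sites, so three ATP per full rotor revolution.
CATALYTIC_SITES = 3

# c subunit stoichiometries as quoted in the text.
c_ring_sizes = {
    "bovine mitochondria": 8,
    "yeast mitochondria": 10,
    "spinach chloroplast": 14,
    "S. platensis": 15,
}


def ions_per_atp(c_subunits: int) -> Fraction:
    """Ions crossing the membrane per ATP, assuming one ion per c subunit."""
    return Fraction(c_subunits, CATALYTIC_SITES)


for source, n in c_ring_sizes.items():
    ratio = ions_per_atp(n)
    print(f"{source}: c{n} ring -> {float(ratio):.2f} ions per ATP")
```

Using exact fractions makes it explicit that most ring sizes give a non-integer ratio, which is one reason the experimentally determined value of four protons per ATP is notable.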

As discussed in more detail below, F-ATPase has a single stator stalk that functions
to link the soluble ATPase with the membrane sector [68]. The bacterial stator stalk is
composed of a homodimer of b subunits, with a more complex composition in the mitochondrial
enzyme [69] (Fig. 10.1a). The stator stalks in A- and V-ATPase are formed
from heterodimers of subunits E and G, and while there are two copies of the stator stalk
in A-ATPase [70], eukaryotic V-ATPase contains three [71] (Fig. 10.1b, c). Crystal
structures for the mitochondrial F-ATPase stator stalk as well as bacterial and eukaryotic
EG heterodimers have been obtained [69, 72-74], showing the A- and V-ATPase stator
stalks folded as α helical coiled coils with an unusual right-handed twist.

Together, the atomic resolution X-ray structures of ATPases, rotor rings, and stator
stalks provide essential information on the catalytic mechanism of the ATPase
and have allowed generation of plausible models of how the free energy consumed
or released on the F1 during ATP synthesis or hydrolysis is coupled to movements of
ions across the membrane-bound Fo. However, X-ray crystallography has so far not been
able to provide a picture of an intact member of the rotary motor ATPase family,
most likely due to the labile nature of the interaction of the proteolipid ring with the
membrane-bound a subunit. Structural models of intact rotary ATPases are available
from single particle 3D EM reconstructions. While the earlier models from
negatively stained specimens were obtained at low to moderate resolutions of
between 20 and 30 Å, advances in cryo electron microscopy have recently allowed
generation of models in the 10-20 Å resolution range, with one model providing
subnanometer (9.7 Å) detail [65] (see below).


10.3.2    Structural  Information  from  Protein  NMR  Spectroscopy: 
Individual  Subunits  and  Subunit  Domains 

The first F-ATPase subunit to be analyzed by protein NMR spectroscopy was the c
subunit of the proteolipid rotor ring of the Fo sector of E. coli F1Fo-ATP synthase.
The highly hydrophobic α helical hairpin can be readily obtained from E. coli inner
membranes by extraction with organic solvent. Multidimensional, heteronuclear
NMR spectroscopy of the subunit in a mixture of chloroform/methanol/water
allowed determination of the proteolipid structure, initially at neutral pH [75]. A
later structure calculated from data collected at acidic pH (5.5) revealed a different
conformation, in which the C-terminal helix that contains the proton binding Asp61
residue was rotated by close to 180° [76]. However, recent crystal structures of intact
rotor rings (see above) suggest that protonation/deprotonation of the carboxylate
involves only local changes in sidechain conformation, and it remains to be seen
whether the pH-induced structural changes observed for the c subunit monomer
play a role in the proton translocation mechanism.
While the first crystallographic structure of bovine F1-ATPase allowed detailed
insight into the structure and conformations of the catalytic and non-catalytic subunits,
part of the γ as well as the δ subunit was not resolved due to disorder in the
crystal lattice [44]. However, biochemical experiments conducted with the F-ATPase
from E. coli suggested that especially the ε subunit (the homolog of the mitochondrial δ
subunit, see Table 10.1) played a critical role in energy coupling by connecting the
central rotor (γ subunit) to the proteolipid ring of the proton channel [77]. The structure
of the isolated E. coli ε subunit was subsequently solved by NMR spectroscopy [78,
79] and X-ray crystallography [80]. NMR and crystal structures agreed well, showing
the subunit to be folded in an N-terminal β sandwich and a C-terminal α helix-turn-helix
that could be seen packed against one side of the β sandwich. Interestingly,
while isolated E. coli ε adopts a compact conformation in solution, a recent X-ray
crystal structure of E. coli F1-ATPase showed ε in an extended structure with the
subunit's C-terminal α helix inserted deeply into the catalytic α3β3 hexamer, where it
interacts with and thereby links catalytic β and central rotor γ subunits [53]. Certain
bacteria utilize F-ATPase in both ATP synthesis and proton pumping directions, and
it is believed that inhibition by the ε subunit serves to prevent wasteful ATP hydrolysis
under some growth conditions [53]. In the mitochondrial enzyme, the extended conformation
of the homologous mitochondrial subunit (δ) is prevented by another
small subunit (called mitochondrial ε subunit; not to be confused with bacterial ε
subunit, see Table 10.1) with no counterpart in bacterial F1 that binds right next to
mitochondrial δ [46]. Instead, wasteful ATP hydrolysis in mitochondrial F-ATPase
is prevented by pH-dependent binding of an inhibitor protein ("inhibitory factor
one," IF1) that functions in a similar fashion to the E. coli ε subunit C-terminal domain
in that it inserts into the catalytic interface from the bottom of the α3β3 hexamer [81,
82]. The structure of IF1 has been determined by NMR spectroscopy and contains a
C-terminal dimerization and an N-terminal inhibitory domain [83].

Protein NMR spectroscopy was also successful in determining the structure of
the N-terminal domain of the bacterial δ subunit [84] and its mitochondrial homologue, the
OSCP subunit [85], which were seen to form a compact α helical bundle binding to
the very top of the α3β3 hexamer via the N-terminal 19 residues of one of the α subunits
[85, 86].


10.3.3    Structural  Information  from  Electron  Microscopy: 
From  Projection  Images  to  3D  Reconstructions 

The first structural information for the rotary ATPases was obtained using
transmission electron microscopy (TEM) of negatively stained mitochondrial
membrane preparations [42]. The images showed globular proteins with a diameter of
approximately 10 nm that were attached to the membrane by 2- to 3-nm-long
slender stalks. Treatment of the membranes with low ionic strength buffer resulted
in a water-soluble ATPase and a membrane fraction that had no ATP hydrolyzing
activity [87]. TEM images of the globular particles detached from the membrane
revealed the hexameric architecture of the ATPase sector, but the limited resolution
of the negative stain images did not allow localization of the single copy subunits
of the complexes. This was accomplished by a combination of cryo electron
microscopy and enzyme decoration with monoclonal antibodies and antibody Fab
fragments [88]. TEM analysis of detergent-solubilized membranes revealed the
characteristic dumbbell appearance of the intact F1Fo-ATP synthase particles [89],
and the application of cryo electron microscopy for visualizing lipid
vesicle-reconstituted enzyme confirmed the existence of the central stalk connecting
the F1 and Fo sectors [90, 91]. Subsequent application of statistical image analysis
techniques revealed the presence of a second or stator stalk that could be seen to
connect the top of the ATPase to the membrane at the periphery of the complex
[92-94]. This structure was later shown to contain the δ and b subunits of the
enzyme (bacterial subunit nomenclature) [95]. Such stator stalks were also
observed for the bacterial A/V-type ATPase [96] and later for the eukaryotic
vacuolar ATPase [97]. As mentioned above, it is now established that F-ATPase has
a single stator stalk composed of the δ and b subunits (bacterial enzyme
nomenclature [68]; the single mitochondrial F-ATPase stator stalk contains additional
subunits [69]), while archaeal A- and bacterial A/V-ATPase have two [70] and
eukaryotic V-ATPase has three [71]. The stator stalks in rotary ATPases
are formed by homo- (for the bacterial enzyme) or heterodimeric coiled coil
proteins with an unusual right-handed twist [72-74, 98]. Unlike in F-ATPase, the
A- and V-ATPase stator stalks are not membrane-anchored but connect to subunits
and subunit domains in the ATPase-ion channel interface for which there are no
homologs in the F-ATPase (see Fig. 10.1).

Early 3D EM reconstructions of F- [99], A- [100, 101], and V-ATPases [102-105]
calculated at resolutions of between 15 and 30 Å allowed determination of
the overall subunit architecture of the complexes. Placing of crystal structures of
catalytic sectors, proteolipid rings, and individual subunits into the EM-derived
atomic density maps produced first "pseudo atomic resolution" models of the
rotary ATPases. The limited resolution of these maps, however, sometimes
produced contradictory interpretations as to the placement of some of the subunits. A
more recent application of cryo electron microscopy allowed determination of a
3D map of the A/V-type ATPase from T. thermophilus that provided
unprecedented structural detail [65]. At 9.7 Å resolution, the EM density was able to
resolve individual c subunits in the proteolipid ring and provided a first glimpse
into the interaction of the c subunit ring with the membrane-bound domain of the
a subunit. The model clearly resolved eight transmembrane α helices for the
C-terminal domain of the a subunit, consistent with earlier biochemical studies
conducted with the homologous subunit of the yeast vacuolar ATPase [106]. The
eight TM segments appeared to be packed as two four-α-helix bundles, leading the
authors of the EM study to speculate that each four-helix bundle formed one of the
water-accessible half channels that had been postulated earlier to form an integral
part of the proton pathway across the subunit a-c ring interface in F-ATPase [107,
108]. A selection of the most recent cryo EM reconstructions, fitted with the
available crystal structures, is shown in Fig. 10.3.

Fig. 10.3 3D EM reconstructions of F-, A/V-, and V-ATPase. (a-c) Cryo EM reconstructions of
yeast F-ATPase (emd-2011), A/V-type ATPase of T. thermophilus (emd-5335), and insect
V-ATPase (emd-1590). The crystal structures used for fitting were in (a) bovine F1 (1e79), OSCP
(2bo5), stator stalk assembly (2cly), and rotor ring (3u2f); in (b) A3B3 (3a5c), DF (3aon), EG
heterodimer (3k5b), a subunit N-terminal domain (3rrk), rotor ring with 12 c subunits (modeled;
1c17), and subunit d (1r5z); and in (c), as in (b) plus subunit C (1u7l), subunit H (1ho8) and the K10
ring of E. hirae (2bl2)

10.3.4    Structural  Information  from  Native  Mass  Spectrometry 

Mass spectrometry (MS) has become a powerful tool for analyzing structural
features of large protein complexes. In native mass spectrometry, protein complexes
are introduced into the gas phase by electrospray ionization (ESI) or laser-induced
liquid bead ion desorption (LILBID) without denaturant, and, under suitable
conditions, the proteins travel as intact multi-subunit complexes through the vacuum of
the spectrometer. Mass spectrometry of native protein complexes can provide
essential information, such as subunit stoichiometry and, in the case of membrane
proteins, lipid binding, that is often difficult to obtain by classical biochemical methods.
Information about conformation can be obtained from an analysis of the proteins'
cross section using ion mobility mass spectrometry. Several rotary motor ATPases
and their sub-complexes have been analyzed by native MS. Native ESI-MS analysis
conducted with A/V-type ATPase from T. thermophilus and V1-ATPase from yeast
allowed unambiguous determination of the subunit stoichiometry, including the
presence of 2 and 3 EG heterodimer stator stalks, respectively [70, 71]. LILBID
analysis of the A-ATPase from the hyperthermophilic archaeon Pyrococcus
furiosus allowed determination of the subunit stoichiometry of the intact enzyme,
including the number of c subunits in the membrane-bound proteolipid ring, information
that has been difficult to obtain by traditional biochemical methods [100]. More
recently, ESI native mass spectra of unprecedented resolution and clarity could be
obtained for the intact A/V-type ATPases from T. thermophilus and (to a somewhat
lesser degree) E. hirae, allowing detection of nucleotide-dependent conformational
changes in the ATPase sector and binding of specific lipids to the
detergent-solubilized membrane sector [109].
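
The mass assignment that underlies such stoichiometry determinations can be illustrated with a short calculation. In an ESI spectrum, a complex of mass M appears as a series of peaks that differ by one charge, and two adjacent peaks are in principle enough to recover both the charge and M. The Python sketch below assumes positive-ion mode with protons as the only charge carriers; the function names and the 600 kDa test mass are hypothetical and are not taken from the studies cited above.

```python
PROTON_MASS = 1.007276  # Da, mass of the charge-carrying proton


def mz(mass, z):
    """m/z of an [M + zH]^z+ ion for a neutral mass in Da."""
    return (mass + z * PROTON_MASS) / z


def deconvolve_pair(mz_low, mz_high):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_low belongs to charge z + 1 and mz_high to charge z, so that
    (mz_low - p) / (mz_high - mz_low) equals z for proton mass p.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


# Synthetic example: a hypothetical 600 kDa complex observed at 50+ and 51+.
true_mass = 600_000.0
z_est, mass_est = deconvolve_pair(mz(true_mass, 51), mz(true_mass, 50))
print(z_est, round(mass_est, 1))
```

Comparing the recovered mass with the summed masses of candidate subunit combinations is what, in principle, yields the stoichiometry. Measured peaks are broadened by residual solvent, detergent, or lipid, so real assignments are made from the whole charge-state series rather than from a single pair.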

10.3.5    Structural  Information  from  Small  Angle 
X-Ray  Scattering 

Much like TEM, small angle X-ray scattering of biological macromolecules
(BioSAXS) was one of the early biophysical tools that was applied to obtain
structural information for rotary motor ATPases and their functional domains. Initial
pioneering studies were limited to the analysis of hydrodynamic parameters
including radius of gyration (Rg) and pairwise distance distribution functions (P(r)),
providing overall information on molecular mass and size of the complexes in solution
[110]. Subsequently developed algorithms that allow shape determination
from SAXS intensities were used to obtain low resolution solution structures of F1
and V1 ATPase sectors as well as individual subunits and subunit complexes from
the eukaryotic V-ATPase [111-114]. In one study, SAXS was used to determine the
shape of the yeast V-ATPase stator stalk complex bound to the regulatory C subunit,
revealing an L-shaped structure [104]. Interestingly, the resulting envelope did not
match the atomic density for the stator stalk as seen in a negative stain 3D EM
reconstruction, a mismatch that led the authors of the study to speculate that the
stator stalks have to change conformation from a solution state to the structure when
bound in the assembled enzyme [104].
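
The relationship between the two parameters named above, Rg and P(r), can be made concrete with a small numerical example. The Python sketch below uses the standard relation Rg^2 = ∫ r^2 P(r) dr / (2 ∫ P(r) dr) and checks it against a uniform sphere, for which Rg = sqrt(3/5) R is known analytically. The 10 nm diameter is only an illustrative value, and the sphere is a stand-in for a globular particle, not a model of any ATPase sector.

```python
import math


def sphere_pr(r, diameter):
    """Analytical (unnormalized) P(r) of a uniform solid sphere."""
    x = r / diameter
    if x < 0.0 or x > 1.0:
        return 0.0
    return x * x * (2.0 - 3.0 * x + x ** 3)


def radius_of_gyration(r_values, p_values):
    """Rg from P(r) by trapezoidal integration of the two moments."""
    num = 0.0
    den = 0.0
    for i in range(1, len(r_values)):
        dr = r_values[i] - r_values[i - 1]
        num += 0.5 * dr * (r_values[i] ** 2 * p_values[i]
                           + r_values[i - 1] ** 2 * p_values[i - 1])
        den += 0.5 * dr * (p_values[i] + p_values[i - 1])
    return math.sqrt(num / (2.0 * den))


diameter = 10.0  # nm, illustrative value only
r_grid = [diameter * i / 2000 for i in range(2001)]
p_grid = [sphere_pr(r, diameter) for r in r_grid]
rg = radius_of_gyration(r_grid, p_grid)
print(round(rg, 3), round(math.sqrt(0.6) * diameter / 2.0, 3))
```

For an experimental P(r), the same integration applies to the tabulated curve; the largest r at which P(r) is nonzero gives the maximum particle dimension.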


10.4    Towards  the  Mechanism  of  Rotary  Catalysis 

In parallel with the structure determination of the rotary ATPases, which
started with the early electron microscopy studies and is still ongoing, efforts
were directed at determining the molecular mechanism of enzyme catalysis, that
is, the reversible conversion of the potential energy stored in a transmembrane ion
gradient to the chemical energy stored in the off-equilibrium mass action ratio of ATP
and ADP in the cell. The process of ion gradient-driven ATP synthesis or ATP
hydrolysis-driven buildup of an ion gradient is generally referred to as "energy
coupling" or "energy conversion."
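
The energy balance implied by this definition can be sketched numerically. The Python example below compares the free energy needed to synthesize ATP at a given mass action ratio with the free energy released per ion moving down an ion motive force. The standard free energy of 30.5 kJ/mol is the commonly quoted textbook value and is an assumption here, as are the nucleotide and phosphate concentrations and the 180 mV ion motive force, none of which come from this chapter.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def delta_g_atp_synthesis(atp, adp, pi, dg0_kj=30.5, temp_k=298.15):
    """Free energy (kJ/mol) required to make ATP; concentrations in mol/L."""
    return dg0_kj + R * temp_k * math.log(atp / (adp * pi)) / 1000.0


def energy_per_ion(ion_motive_force_mv):
    """Free energy (kJ/mol) released per ion crossing the membrane."""
    return F * (ion_motive_force_mv / 1000.0) / 1000.0


def min_ions_per_atp(atp, adp, pi, ion_motive_force_mv):
    """Thermodynamic lower bound on ions translocated per ATP synthesized."""
    return delta_g_atp_synthesis(atp, adp, pi) / energy_per_ion(ion_motive_force_mv)


# Illustrative values: 3 mM ATP, 0.3 mM ADP, 5 mM phosphate, 180 mV.
dg = delta_g_atp_synthesis(3e-3, 0.3e-3, 5e-3)
ratio = min_ions_per_atp(3e-3, 0.3e-3, 5e-3, 180.0)
print(round(dg, 1), round(ratio, 2))
```

Under these assumed conditions roughly three ions per ATP are the thermodynamic minimum, which is the order of magnitude expected for an enzyme that couples one full turn of its rotor to three catalytic events.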
10.4.1    The Beginnings: 18O Exchange and a Hypothesis

Early experiments had shown that F-ATPase contains three catalytic sites (on the F1)
but only a single proton channel (in the transmembrane Fo). How do the three
catalytic sites communicate with each other and how are events in three catalytic sites
coupled to a single proton pore? Early experiments conducted by the groups of Paul
Boyer (using 18O exchange kinetics) and Harvey Penefsky (employing quench
flow kinetics) showed that, under limiting ATP concentrations, the equilibrium
constant of the ATP hydrolysis reaction on the catalytic sites is close to unity, suggesting
that much of the energy of the ion gradient during ATP synthesis is used to open the
catalytic site to allow release of synthesized ATP and not for the synthesis reaction
itself [18, 115, 116]. These and other experiments then allowed Paul Boyer to
formulate the "binding change mechanism" of F-ATPase, which, as mentioned above,
predicted that the three catalytic sites at any given time were in different conformations,
with one tight (ATP-bound) site, one loose (ADP-bound) site, and one empty site.
During ATP synthesis, each catalytic site undergoes cyclic changes from open
(empty) to loose (ADP-bound) to tight (ATP-bound) and opens again, with the order
reversed during ATP hydrolysis. The cyclic conversions of each site are coupled to the
two other sites, 120° and -120° out of phase. While ATP binding was found to occur
with strongly negative cooperativity, the turnover rate increases from the so-called
unisite catalysis, where only one site operates (due to limiting ATP concentration), to
multi-site catalysis by a factor of ~10^6 [18, 117]. However, the notion that maximal
turnover rate can be achieved with only two catalytic sites filled has been challenged
based on tryptophan fluorescence quenching experiments that showed that
maximum turnover in E. coli F1-ATPase requires nucleotide occupancy of all three sites
[22, 118] as seen in the three-nucleotide X-ray crystal structure of bovine F1 [47].
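
The bookkeeping of the binding change mechanism, three sites passing through the same cycle one third of a turn apart, can be written as a minimal state machine. The Python sketch below is only an illustration of the scheme as described above; the function name, the site numbering, and the choice of which site starts in which conformation are arbitrary assumptions.

```python
SYNTHESIS_CYCLE = ["open", "loose", "tight"]


def site_states(step, direction=+1):
    """Conformations of the three catalytic sites after `step` 120-degree steps.

    direction=+1 follows the synthesis order (open -> loose -> tight -> open);
    direction=-1 runs the same cycle backwards, as during hydrolysis.
    Site i starts one position ahead of site i - 1, i.e., 120 degrees out of phase.
    """
    return tuple(
        SYNTHESIS_CYCLE[(direction * step + offset) % 3] for offset in range(3)
    )


for step in range(4):
    print(step, site_states(step, +1), site_states(step, -1))
```

At every step the three sites occupy three different conformations, and one full revolution (three steps) returns each site to its starting state, in either direction.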

A further prediction by Boyer was that coupling of the three catalytic sites to the
single ion channel would involve rotation of some (or all) of the single copy
subunits of the enzyme. In 1997, Paul Boyer (for the catalytic mechanism) and Sir John
Walker (for the crystal structure determination of bovine F1-ATPase) were awarded
the Nobel Prize in chemistry
(http://nobelprize.org/nobel_prizes/chemistry/laureates/1997).


10.4.2    Proving  Rotational  Catalysis:  Or  Putting  Biophysics 
to  the  Test 

While the hypothesis of subunit rotation seemed plausible early on considering
the symmetry mismatch between three catalytic sites and one ion channel,
validating or invalidating this hypothesis proved much more difficult than
anticipated. For example, biochemical experiments using photo or disulfide
crosslinking were mostly in support of rotational catalysis in that they showed
that linking putative stator and rotor subunits leads to enzyme inhibition while
linking rotor and rotor or stator and stator components had little effect [24, 77].
However, in the end, the static crosslinking experiments fell short of a definitive
proof as there may be alternative explanations for enzyme inhibition upon
crosslinking. More direct evidence was obtained using reversible crosslinks that
showed that the rotor subunit is indeed moving between different catalytic
subunits [119], but by that time, researchers realized that only real-time experiments
could provide sufficient evidence for rotational catalysis. The first observation
of rotor dynamics on a timescale of catalysis was provided by fluorescence
anisotropy relaxation measurements of immobilized F1-ATPase molecules, but
for technical reasons, even these experiments were limited to showing a rotation
angle short of a full rotation (±200°) [120].


10.4.3    Looking  at  Single  Molecules:  The  Breakthrough 

While the sum of biochemical and biophysical experiments available at the time left
little doubt that subunit rotation was integral to the mechanism of energy coupling
in F-ATPase, a direct and convincing proof was only obtained from single molecule
observation that unequivocally showed unidirectional and continuous rotation of the
γ subunit in response to ATP hydrolysis. The experimental setup consisted of
immobilized α3β3γ sub-complexes that were tagged with a long fluorescently labeled
actin filament attached to the mobile γ subunit for direct CCD camera-based
observation under the fluorescence microscope (Fig. 10.4a; [121]). The resulting video
recordings showed counterclockwise rotation of the γ subunit during ATP
hydrolysis when viewing the complex from the bottom (corresponding to a direction from
the membrane surface towards the cytoplasm). From the rotation rate and length of the
actin filament, it could be determined that ATP hydrolysis generated a torque of up
to 80 pN·nm, suggesting that the F1-ATPase motor was converting the free energy
change generated during ATP hydrolysis to mechanical work with near 100%
efficiency (this is possible as F1-ATPase is not a heat engine) [122].
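
How a torque follows from a rotation rate and a filament length can be illustrated with a hydrodynamic estimate. The Python sketch below assumes that the filament behaves as a thin rigid rod rotating about one end in water, with rotational drag (4π/3) η L^3 / [ln(L/2r) - 0.447]. That this expression matches the analysis in the cited work is an assumption, and the filament length, filament radius, viscosity, and rotation rate are illustrative inputs rather than data from this chapter.

```python
import math


def rod_drag_coefficient(length_m, radius_m=5e-9, viscosity=1e-3):
    """Rotational drag (N m s) of a thin rod pivoting about one end.

    radius_m and viscosity (Pa s) default to assumed values for an
    actin filament in aqueous buffer.
    """
    return (4.0 * math.pi / 3.0) * viscosity * length_m ** 3 / (
        math.log(length_m / (2.0 * radius_m)) - 0.447
    )


def torque_pn_nm(rev_per_s, length_m):
    """Torque (pN nm) needed to rotate the rod at a steady rate against drag."""
    omega = 2.0 * math.pi * rev_per_s          # angular velocity, rad/s
    return omega * rod_drag_coefficient(length_m) * 1e21  # N m -> pN nm


# Illustrative case: a 2.6 micrometer filament turning at 0.5 revolutions per second.
print(round(torque_pn_nm(0.5, 2.6e-6), 1))
```

Because the drag grows roughly with the cube of the filament length, long filaments rotate slowly for the same torque, which is one reason why smaller probes were later needed to improve time resolution.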

Subsequent refinement of the experimental setup produced a wealth of
detailed information on the catalytic mechanism of F1-ATPase as well as intact,
immobilized F1Fo-ATP synthase. A crucial improvement of the time resolution
could be accomplished by replacing the fluorescent actin filaments with small
nylon beads [123] or gold nanorods [124] in combination with high speed CCD
cameras. For example, improving the time resolution of the imaging showed
that each 120° rotation step is divided into an 80° sub-step induced by ATP
binding, and a 40° sub-step driven by phosphate and/or ADP release [123, 125].
Higher time resolution also allowed a thermodynamic analysis of F1 rotation,
suggesting that the energy pathway of catalysis is relatively flat, allowing
turnover without F1 having to overcome excessive energy barriers [126]. Sophisticated
single molecule manipulation using magnetic beads allowed mechanical
winding of the γ subunit and it could be shown that this forced γ rotation was able to
synthesize ATP from ADP and inorganic phosphate [127].

Fig. 10.4 Experimental setup for single molecule observation by fluorescence microscopy and
FRET spectroscopy. Observation of subunit rotation via actin filament attached to (a) γ subunit in
F1-ATPase and (b) c subunit in intact F1Fo-ATP synthase (adapted from [121] and [128],
respectively). (c) Observation of rotation in intact A/V-ATP synthase from T. thermophilus via nylon
bead attached to the A subunit (adapted from [27]). (d) FRET-based observation of rotation in lipid
vesicle-reconstituted E. coli ATP synthase. Donor and acceptor dyes were attached to the γ and b
subunits, respectively (shown as green and red spheres). Liposomes containing on average one ATP
synthase molecule were equilibrated in low pH buffer and then diluted into basic buffer containing
ADP and inorganic phosphate (Pi) to initiate ATP synthesis. Liposomes were allowed to diffuse
through a laser focus in a confocal microscope, allowing recording of fluorescence bursts for a
duration of between 50 and 250 ms. Fluorescence bursts showed three levels of donor and acceptor
intensity, corresponding to the three positions of the donor dye attached to the rotor. The order of
the levels changed when going from ATP synthesis to ATP hydrolysis (adapted from ref. [133])

The experimental setup used for single molecule observation of intact,
detergent-solubilized ATP synthase using actin filaments [128] and spherical beads as probes
is illustrated in Fig. 10.4b, c. Similar experimental setups were then used to show
rotation of the proteolipid ring in lipid bilayer-reconstituted, intact ATP synthase
[129]. Recent real-time observation by the atomic force microscope (AFM) revealed
that cyclic up and down movements of the β subunit C-terminal domains can even
be observed in rotorless "F1," suggesting that communication of the subunits within
the ring of the catalytic core is sufficient for providing directionality [130].

The single molecule rotation experiments were quickly adapted to the bacterial
A/V-ATPase sector and the intact yeast vacuolar ATPase, and the studies showed that,
surprisingly, A/V-ATPase appeared to use a slightly different catalytic mechanism
in that 80° and 40° sub-steps were not observed [27, 131]. It is possible that the
turnover kinetics of A/V-ATPase is different, with shorter dwell times for one of the
sub-steps, too short to be resolved, but it is also possible that A/V-ATPase uses only
one power stroke, e.g., as a result of ATP binding. The only single molecule
observation of a eukaryotic rotary motor, conducted with the intact vacuolar ATPase from
yeast, confirmed subunit rotation, albeit with a slightly lower torque compared to
F-ATPase [132].

All single molecule experiments using direct observation of actin filaments,
polystyrene beads, or gold nanorods were performed with rotary motor complexes
functioning in the direction of ATP hydrolysis (Fig. 10.4a-c). For observation of single
molecules actively synthesizing ATP driven by a proton gradient and membrane
potential, a different experimental setup had to be developed.


10.4.4    Single  Molecule  Fluorescence  Energy  Transfer 
Spectroscopy 

Up to this point, it had been shown that rotary motor ATPases rotate unidirectionally
(counterclockwise when seen from the membrane surface towards the bottom of the α3β3γ
complex) and while there was little doubt that directionality is reversed when the
enzyme switches from hydrolysis to synthesis of ATP, the change in the direction of
rotation needed to be shown experimentally. This was ultimately accomplished using
single molecule Förster resonance energy transfer (smFRET) spectroscopy of
liposome-reconstituted E. coli ATP synthase labeled with donor and acceptor
fluorophores at rotor and stator subunits, respectively [133] (see Fig. 10.4d). ATP synthesis
was driven by a transient transmembrane pH gradient established by diluting
liposomes equilibrated at low pH into a buffer of basic pH. For observation of single
liposome-bound ATP synthase molecules, a confocal microscope setup was used
where donor and acceptor fluorescence are recorded while liposomes are allowed to
diffuse through a femtoliter size laser focus. During the resulting "bursts" of fluorescence,
three levels of fluorescence intensity were observed and it could be shown that during
membrane potential-driven ATP synthesis, the order of the fluorescence levels in the
bursts was reversed from the direction of rotation observed for ATP hydrolysis.

A similar smFRET setup was used in subsequent experiments to show that rotation
of the proteolipid ring occurred in steps of ~36° (the E. coli enzyme has 10 c subunits in
the proteolipid ring) [134]. ATP hydrolysis-induced stepping of the proteolipid rings
was subsequently also shown using high speed CCD-based direct observation of
enzyme-coupled gold nanorods and nylon beads for E. coli F- [135] and T. thermophilus
A/V-type [27] ATP synthases, respectively. Recording fluctuations in polarization of
gold nanorods induced by subunit rotation using an ultra high speed CCD camera
revealed a transient interaction of one of the loops of the a subunit with the cytoplasmic
loops of the proteolipid ring as part of each stepped motion of the ring of c subunits.
Such transient interaction between proteolipid rotor and a subunit stator may serve to
bias the "Brownian ratchet" (see below) in the desired direction of rotation [135].
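
The logic of reading a rotation direction from three FRET levels can be sketched as follows. The Python example uses the Förster relation E = 1 / (1 + (r/R0)^6) to show how three donor-acceptor distances give three distinguishable efficiencies, and then classifies a sequence of levels by its cyclic order. The distances, the Förster radius, and the labels "forward" and "reverse" are hypothetical; which cyclic order corresponds to synthesis and which to hydrolysis is not encoded here.

```python
def fret_efficiency(distance_nm, r0_nm):
    """Forster relation between donor-acceptor distance and transfer efficiency."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def proximity_ratio(donor_counts, acceptor_counts):
    """Uncorrected FRET estimate from photon counts within one dwell."""
    return acceptor_counts / (donor_counts + acceptor_counts)


def rotation_sense(level_sequence):
    """Classify a sequence of level labels (0, 1, 2) by its cyclic order."""
    steps = {(b - a) % 3 for a, b in zip(level_sequence, level_sequence[1:])}
    if steps == {1}:
        return "forward"
    if steps == {2}:
        return "reverse"
    return "ambiguous"


# Three hypothetical rotor positions at 4, 6, and 8 nm from the acceptor, R0 = 6 nm.
levels = [fret_efficiency(d, 6.0) for d in (4.0, 6.0, 8.0)]
print([round(e, 2) for e in levels])
print(rotation_sense([0, 1, 2, 0, 1]), rotation_sense([0, 2, 1, 0]))
```

Because a burst lasts only as long as the liposome stays in the laser focus, each burst contains just a few level transitions, so the direction is established statistically over many bursts rather than from a single trace.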


10.4.5   Rotational  Catalysis  and  Elastic  Energy  Coupling 

As summarized above, both ATPase and membrane sectors function as stepper
motors, with 120° steps for the central rotor of the ATPase and 360°/n steps for the
ion channel, with n being the number of c subunits in the proteolipid ring. With few
exceptions, the number of c subunits in the ring is not divisible by three, resulting in
a non-integer ratio of the number of ions translocated for each molecule of ATP
synthesized or hydrolyzed. The much smaller rotational steps of the proteolipid ring
(36° for the E. coli F-ATPase c10 ring) compared to the 120° steps of the central
rotating stalk require transient storage of elastic energy by one or several
components of the motor. Single molecule experiments conducted with the E. coli
F-ATPase revealed that it is the γ subunit, with some contribution from the
C-terminal domains of the catalytic β subunits, that functions in elastic energy
storage [136]. In this model, the energy provided by the translocation of the first two (or
three) protons serves to "wind up" the γ subunit before a third (or fourth) proton
results in a 120° rotational step of the rotor with concomitant opening and closing
of catalytic sites. The same study also suggested that the b subunit dimer is
relatively stiff, indicating that the stator stalk does not play a significant role in transient
energy storage [136]. However, modeling studies conducted with the bacterial
A/V-ATPase suggest that the stator stalks have to be flexible to adapt to the different
conformations of the catalytic subunits [73] and it remains to be seen whether
stiffness of the stator stalk is a conserved feature of all rotary ATPases or limited to
bacterial F-ATP synthase.
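
The symmetry mismatch described in this section reduces to simple arithmetic, sketched below in Python with exact fractions. For a ring of n c subunits the sketch gives the ring step angle, the ion-to-ATP ratio n/3, and one possible way of distributing the n ring steps over the three 120° rotor steps of a full turn. The even-as-possible distribution is an assumption made for illustration; the function names are hypothetical.

```python
from fractions import Fraction


def ring_step_degrees(n_c):
    """Angular step of a proteolipid ring with n_c subunits."""
    return Fraction(360, n_c)


def ions_per_atp(n_c, catalytic_sites=3):
    """Ions translocated per ATP when one turn serves all catalytic sites."""
    return Fraction(n_c, catalytic_sites)


def ring_steps_per_rotor_step(n_c, catalytic_sites=3):
    """Ring steps accumulated during each successive 120-degree rotor step."""
    return [
        (k * n_c) // catalytic_sites - ((k - 1) * n_c) // catalytic_sites
        for k in range(1, catalytic_sites + 1)
    ]


for n in (10, 12):
    print(n, float(ring_step_degrees(n)), ions_per_atp(n), ring_steps_per_rotor_step(n))
```

For a c10 ring the three rotor steps of one revolution are served by three, three, and four ring steps, which is the "third (or fourth) proton" pattern described above, whereas a ring with 12 subunits divides evenly into four ring steps per rotor step.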


10.4.6    Theoretical  Considerations 

As mentioned above, we still have no high resolution structure of the proteolipid
ring-subunit a interface. Critical residues involved in proton transport (in E. coli
F-ATPase) are Asp61 in subunit c and Arg210 in subunit a. The difficulty in obtaining
crystals for this part of the complex may be explained by the fact that the tether
interactions at the subunit a-c interface mentioned above cannot be too tight. The
surface of the c ring is likely covered with lipid molecules, a greasy surface that may
provide lubrication for the a-c interface so that the two subunits can slide past each
other several hundred times per second during steady state turnover. The a-c interface
has been studied by disulfide crosslinking and, based on these data, models of the
interface have been proposed [137]. Early on it was postulated that subunit a contains two
water-accessible hemi channels that are interrupted at the level of Asp61 and Arg210 in
the membrane [107] (see Fig. 10.5). The model predicted that protons (hydronium ions)
would enter from, e.g., the periplasmic half channel, bind the carboxylate of Asp61, and,
after being carried around by one revolution of the ring, would be released
on the cytoplasmic side through the second half channel. While evidence from
accessibility experiments has been obtained that supports a two half channel model [108],
details of the proton pathway through subunit a remain to be determined. As
mentioned above, the two four-α-helix bundles seen in the cryo EM reconstruction of
T. thermophilus ATPase have been proposed to act as the two half channels in A-,
A/V-type, and V-ATPases [65]. The F-ATPase a subunit, however, likely has only five TM
segments, but since the presence of the b subunits is required for proton translocation
[138], it cannot be ruled out that the two N-terminal TM helices of the b subunits
contribute to the two half channels in F-ATPase.

Fig. 10.5 Mechanistic model of the rotary ATPase proton turbine. In the direction of ion
gradient-driven ATP synthesis, protons (or sodium ions) enter the first half channel in subunit a and bind the
ionized form of a carboxylate located in the middle of one of the transmembrane α helices of subunit
c (Asp61 in E. coli F-ATP synthase). Binding of the proton or sodium ion neutralizes the negatively
charged carboxylate, allowing it to move away from the subunit a-c interface into the hydrophobic
proteolipid-lipid interface. This movement will align another proton or sodium-bound carboxyl
group with the second half channel, allowing its ion to dissociate and diffuse to the other side of the
membrane. The sidechain of a critical arginine residue in subunit a acts as a gate to prevent ions
from moving directly from the exit of the first to the entry of the second half channel. According to the
"Brownian ratchet" model of rotary ATPase-ion channel function [120], thermal fluctuations allow
the ring to overcome local minima of the electrostatic potential, possibly biased by a transient "tether"
interaction between subunits a and c [139]. Illustration adapted from ref. [107]

It has been suggested that torque generation in the rotary ATPase proton channel
functions according to a "Brownian ratchet" mechanism, where thermal fluctuations
allow the ring to overcome local minima in the electrostatic potential influenced by
the positively charged Arg210 sidechain and the carboxylate of Asp61 [139]. Once a
proton (or sodium ion) is picked up by an Asp61 carboxylate, the now neutral sidechain
can move away from the a-c interface into the lipid bilayer, moving another
carboxylic acid into the interface where it will release its proton to the cytoplasm. This
mechanism can be easily envisioned in the direction of proton transport along a pH gradient
while in the direction of proton pumping, a change in pKa of Asp61's carboxyl may be
required to allow release of the proton to a more acidic periplasm.
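
Why a shift in pKa would be needed for pumping can be seen from the Henderson-Hasselbalch relation, sketched below in Python. The function returns the fraction of carboxyl groups that carry a proton at a given pH. The pH and pKa values in the example are hypothetical round numbers chosen to show the trend; they are not measured values for Asp61.

```python
def protonated_fraction(ph, pka):
    """Henderson-Hasselbalch: fraction of carboxyl groups carrying a proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# A site facing a compartment at pH 6 (hypothetical values throughout).
for pka in (7.0, 6.0, 5.0):
    print(pka, round(protonated_fraction(6.0, pka), 3))
```

With a pKa one unit above the local pH, about nine out of ten sites stay protonated and release is unfavorable, whereas lowering the pKa to one unit below the local pH leaves fewer than one in ten sites protonated. A carboxylate that has to give up its proton into an acidic compartment would therefore need its pKa lowered while it faces the exit channel.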


10.5    Current  Developments  and  Remaining  Challenges 

Past biophysical and biochemical investigations of the rotary motor ATPase have
produced a wealth of structural and mechanistic information, especially for the
catalytic sectors. Much less detail is available for the membrane portions of the enzymes,
in particular how the proteolipid ring interacts with the (membrane-bound part of the) a
subunit during rotational catalysis. Structural studies of an intact rotary motor
ATPase have so far been limited to crystal structures of mitochondrial
F1-proteolipid ring complexes or medium resolution cryo electron microscopy maps of
bacterial A/V-ATPase. While the crystal structures provide some insight into the
interaction of membrane subunits with subunits of the soluble ATPase sectors, as of
now only the cryo EM models provide a glimpse of the interface of the proteolipid
ring and the a subunit. Resolving this interface with atomic detail will be absolutely
required for elucidating the mechanism by which protons (or sodium ions) are
transported from one side of the membrane to the other, and how this ion transport
generates the driving force for subunit rotation-driven synthesis of ATP.

Chapter 11

Biophysical  Approaches  to  Understanding 
the  Action  of  Myosin  as  a  Molecular  Machine 


Mihaly  Kovacs  and  Andras  Malnasi-Csizmadia 


Abstract Many of the concepts of biological structure-function relationships were
pioneered in muscle research, resulting in mechanistic knowledge spanning from
molecular actions to macroscopic phenomena. Due to its abundance and spatial
organization, the actomyosin system powering muscle contraction could readily be
investigated by a wide variety of biophysical methods, and also provided fertile
ground for the development of these techniques. For decades, muscle actomyosin
was the only known biological motor system. It was later discovered that muscle
contraction represents a highly specialized form of actomyosin-based contractility.
All eukaryotic cells express a variety of myosin isoforms, which drive cellular
processes including cell division, differentiation, movement, intracellular transport,
and exo- and endocytosis. In this chapter we discuss how various biophysical
methods have been used to elucidate the structural and functional properties of the
actomyosin system and the physiological processes driven by its motor activity. We
provide an overview of techniques applied to study molecular and supramolecular
features of diverse myosin motors including their structure, kinetics,
conformational transitions, force generation, assembly, cooperation, regulation, and the
linkage of these properties to cellular and physiological functions.

Keywords  Myosin  •  Actin  •  Structure  •  Kinetics  •  Mechanism  •  Muscle  •  Energy 
transduction  •  ATP  •  ATPase  •  Cytoskeleton  •  Method  •  Enzyme  •  Activation  • 
Allostery • Motility • Molecular motors

11.1    Seminal Discoveries in Muscle Research Contributed
to General Concepts of Biology

All  life  forms  exhibit  some  form  of  motility,  the  capability  to  perform  active  move- 
ment and  associated  work.  Motility  is  driven  by  proteins  called  molecular  motors. 
These  enzymes  can  convert  chemical  energy,  stored  most  commonly  in  the  anhy- 
dride bonds  of  ATP  or  a  proton  concentration  gradient,  to  mechanical  energy.  This 
direct  conversion  mechanism  is  fundamentally  different  from  that  of  man-made 
combustion  engines  in  which  the  free  enthalpy  liberated  in  a  chemical  reaction  is 
first  converted  to  heat,  and  heat  is  then  used  to  produce  work. 

The  most  obvious  macroscopic  manifestation  of  motility  is  muscle  action.  It  is  driven 
principally  by  two  kinds  of  protein,  actin  and  myosin.  For  a  long  time,  muscle  was  the 
only  system  for  studying  biological  motility.  Studies  on  muscle  initiated  the  field  of 
mechanobiochemistry,  which  paved  the  way  for  protein  mechanical  studies  also  in  vari- 
ous non-muscle  systems.  With  advances  in  cell  biological  and  molecular  genetic  tech- 
niques, a  wide  spectrum  of  non-muscle  myosins  as  well  as  tubulin-  and  nucleic 
acid-based  molecular  motors  were  discovered  and  mechanistically  characterized. 

Classical  advantages  of  muscle  research  include  (1)  the  abundance  of  proteins 
allowing  large-scale  homogeneous  preparations  for  biochemical  studies,  (2)  the 
high  level  of  spatial  organization  allowing  diffractional  and  microscopic  investiga- 
tions, and  (3)  a  macroscopic  contractile  phenomenon  readily  measurable  by 
mechanical  manipulation  in  physiological  experiments  [1]. 

Szent-Gyorgyi  introduced  the  idea  of  structure-function  relationships  in  biological 
systems,  and  chose  muscle  as  the  most  regularly  organized  specimen  to  study.  His  group 
was  the  first  to  purify  actin  and  myosin  in  isolation  [2].  They  made  actomyosin  threads 
from  the  isolated  proteins,  and  investigated  their  contractile  properties  in  correlation 
with  the  biochemical  features  of  protein  solutions  and  suspensions.  In  addition,  they 
described  the  cyclic  interaction  between  the  two  protein  components  and  discovered 
that this interaction is coupled to the ATP hydrolytic cycle [2]. Together with the
subsequent discovery of the actin-induced activation of myosin's ATPase activity by Biro and
Szent-Gyorgyi [3], these works laid the foundations for the concept of allosteric
activation, which was later explored in detail in structural and kinetic studies (discussed in the
next two sections). Decades later, the combination of structural and kinetic
investigations with advanced force manipulation techniques revealed a special mode of
large-scale allosteric communication in which the activities of individual catalytic sites are
regulated and coordinated by external forces acting on proteins [4-7].

The  principles  of  the  allosteric  regulation  of  specific  events  of  the  myosin  ATPase 
cycle  by  myosin's  interaction  with  actin  proved  to  be  generally  applicable  to  a  wide 
variety of motor-track and GTP-dependent signaling systems [8-10]. In this framework,
actin can be viewed as a nucleotide exchange factor for myosin, accelerating
the release of hydrolysis products. Other biological phenomena discovered in muscle
research include the principles governing the supramolecular assembly of
actin and myosin filaments, and the role of Ca2+ as a second messenger,
a line of work initiated by Weber and colleagues [1, 11-13].

11.2    Elucidation of Molecular and Supramolecular
Structures

H.E. Huxley and Hanson, in parallel with A.F. Huxley and Niedergerke, proposed
the sliding filament model of muscle contraction based on their interference and phase
contrast microscopy, electron microscopy (EM), and low-angle X-ray diffraction
investigations [14-16]. Among several competing theories, this model emerged as, and has
remained, the prevalent conceptual framework for muscle action. The model
explained  contraction  as  resulting  from  the  sliding  of  intercalated  thick  (myosin) 
and  thin  (actin)  filaments  past  each  other.  Sliding  was  proposed  to  be  powered  by 
the  action  of  crossbridge  structures  emanating  from  thick  filaments.  Crossbridges 
were  proposed  to  swing  during  the  force-generating  step  (powerstroke),  thus  exert- 
ing a  rowing-like  action  to  perform  mechanical  work. 

Thin  and  thick  filaments  are  assemblies  mainly  consisting  of  actin  and  myosin 
molecules,  respectively.  Actin  monomers  (G-actin)  have  a  molecular  weight  of 
42  kDa  and,  at  physiological  conditions,  polymerize  into  long,  helically  structured 
filaments  (F-actin).  The  first  X-ray  structures  of  G-actin  were  of  those  crystallized  in 
complex  with  DNase  I  and  other  actin  binding  proteins  [17-19].  These  structures 
revealed  that  the  actin  monomer  is  organized  into  two  domains  with  a  bound  nucleo- 
tide located  between  these  two  domains.  The  atomic  structure  of  the  actin  filament 
was  modeled  by  docking  the  G-actin  structures  into  the  cryo-EM  envelope  of  F-actin 
[20].  Recently,  more  refined  structures  were  published,  which  revealed  the  fine  details 
of  conformational  rearrangements  associated  with  actin  polymerization  [21,  22]. 

Myosin  can  be  extracted  from  myofibrils  at  high  ionic  strength.  The  molecular 
weight  of  the  myosin  holoenzyme  is  around  500  kDa,  as  determined  by  analytical 
ultracentrifugation  [1].  Electrophoretic  analysis  performed  under  denaturing  condi- 
tions showed  that  the  holoenzyme  consists  of  two  heavy  chain  subunits  with  a 
molecular  weight  around  220  kDa,  and  two  pairs  of  light  chains  with  molecular 
weights  around  17-20  kDa.  Historically,  the  light  chains,  which  belong  to  the 
calmodulin  family,  have  been  termed  as  essential  light  chain  (ELC)  and  regulatory 
light  chain  (RLC)  (Fig.  11.1a). 

Electron  micrographs  of  rotary  shadowed  myosin  molecules  showed  that  the 
molecule  consists  of  two  head-like  structures  (forming  the  crossbridges  in  mus- 
cle) and  an  elongated  rod  [23].  Limited  proteolytic  experiments  were  fundamen- 
tal in  resolving  the  "gross  anatomy"  of  the  myosin  holoenzyme.  Myosin  can  be 
proteolytically  digested  to  produce  heavy  and  light  meromyosin  fragments 
(HMM  and  LMM,  respectively)  (Fig.  11.1a)  [1,  24,  25].  HMM  contains  the 
heads  and  the  proximal  part  of  the  rod,  whereas  LMM  forms  the  distal  part  of  the 
rod.  The  actin  binding  property  and  ATPase  activity  reside  in  HMM,  whereas 
LMM  confers  the  capability  of  filament  formation.  Further  proteolysis  of  HMM 
liberates myosin heads (subfragment 1, S1) from the proximal rod fragment
(subfragment 2, S2) (Fig. 11.1a). S1 confers catalytic activity and can bind to
actin, whereas the S2 portion holds the heads together and provides a spacer from
the rest of the tail, which forms the filament backbone [26]. The flexibility of myosin
at the S1-S2 and HMM-LMM junctions was detected by proteolysis, EM, and
time-resolved fluorescence anisotropy decay experiments [23, 27].

Fig. 11.1 Overview of myosin structure. (a) Structure of the myosin 2 holoenzyme. (b) Schematic
representation of a bipolar myosin 2 filament. (c, d) Ribbon diagrams of the crystal structure of
Dictyostelium discoideum myosin 2 motor domain in the up-lever (c) and down-lever (d)
conformations (based on PDB structures 1VOM and 1MMD, respectively)

X-ray diffraction, optical rotation, sequence analysis, and structural modeling
studies revealed that the myosin rod is an elongated coiled-coil made up of two long
α-helical segments, mediating the dimerization of myosin heavy chains [1]. The
structure is held together by a medial hydrophobic stripe, strengthened by charged
residues at its sides. The pattern of the latter elements defines a staggered
arrangement of coiled-coil dimers in the bipolar myosin filament (Fig. 11.1b) [28]. The
double-headed arrangement and filament formation capability are shared by a large
number of myosin isoforms classified as class 2 myosins, which are present in
various forms of muscle and in the cytoplasm of non-muscle cells [29]. The size and
dynamics of myosin filaments vary considerably between sarcomeric (skeletal and
cardiac muscle) and non-sarcomeric (smooth muscle and non-muscle) myosin 2
isoforms. Sarcomeric myosins form stable filaments comprising several hundred
myosin holoenzymes, whereas non-sarcomeric ones are characterized by dynamic
regulation of filament assembly and disassembly (Fig. 11.1b). Vertebrate
non-muscle myosin 2 minifilaments consist of as few as 14 myosin molecules at each
pole, as shown by EM both in vitro [30] and in vivo [31].

Unlike the myosin holoenzyme and the isolated myosin rod, the catalytic HMM
and S1 fragments are soluble at physiological ionic strength, which has greatly
facilitated their kinetic investigation. S1 was also further digested to produce three
heavy chain fragments of 25, 50, and 20 kDa. These fragments were first considered
to be individual domains, but proved to be inactive [32, 33]. Two reactive sulfhydryl
groups (SH1 and SH2) located in the 20 kDa segment proved to be useful in
biophysical experiments, as the attachment of spectroscopic probes to these groups
enabled a wide range of kinetic and structural investigations [34].

According to the swinging crossbridge theory, the bulk of the crossbridge would
have to rotate during force generation. In early electron paramagnetic resonance (EPR)
spectroscopic studies on muscle fibers, however, probes attached to SH1 failed to show
such a reorientation [34]. This and several other unresolved questions regarding
force generation were clarified upon the publication of the atomic structures of the
myosin head. The first structure, solved by Rayment and coworkers, was that of
chicken skeletal muscle S1 with methylated lysine side chains [35]. The structure
was tadpole-like: a large globular motor domain (MD) was followed in sequence
by a long α-helical segment of the heavy chain, to which the ELC and RLC
were bound. The latter structural part (involving all three polypeptide chains) was
termed the neck, and its appearance immediately suggested a lever function. The
previously identified proteolytic fragments turned out to be integral subdomains of
S1, which were linked by flexible, protease-susceptible surface loops: loop 1
connecting the 25 and 50 kDa fragments, and loop 2 connecting the 50 and 20 kDa fragments.

As depicted in the atomic structures of the MD shown in Fig. 11.1c, d, the ATP
binding site is located at the interface of the 25 and 50 kDa subdomains. The 50 kDa
subdomain is divided by a large cleft; thus the "upper" 50 kDa (U50) and "lower"
50 kDa (L50) subdomains were defined. Parts of these subdomains were postulated
to form the actin binding site. The 20 kDa fragment contains a region (termed the
converter) connecting the nucleotide binding site with the lever. This fragment also
contains the long α-helix of the heavy chain portion of the neck. SH1 turned out to
be located at the pivot of the lever, which provided an explanation for the
insensitivity of SH1-attached spectroscopic probes to lever orientation. Lever movement
could be followed in later experiments using EPR and fluorescence spectroscopic
probes attached to the light chains [36-38].

Subsequently, a recombinantly expressed, catalytically active myosin 2 MD fragment
from the amoeba Dictyostelium discoideum and molluscan muscle S1 fragments
were crystallized with a variety of nucleotide analogs mimicking different
intermediates of the enzymatic cycle, yielding a series of crystal structures
(Fig. 11.1c, d) [39-44]. Some nucleotide analogs (including ADP, AMPPNP
(adenylyl-imidodiphosphate), and ADP.BeFx) induced a conformation that closely
resembled the previously described chicken S1 structure [40, 41, 43]. In contrast,
other analogs (including ADP.AlF4 and ADP.VO4) induced a conformation in
which the C-terminal part of the MD, which forms the base of the lever, pointed
in a direction 70° different from that seen in the other structures [40, 42, 43].
These findings were suggestive of the lever swing expected based on previous
results. Some nucleotide analogs (chiefly ADP.BeFx) were able to induce both
states, which implied the reversibility of the lever swing in the absence of actin
(see also the next section). However, the assignment of the detected structural
states to the functional states of the mechanochemical cycle remained to be
addressed by kinetic and spectroscopic studies.


11.3    Kinetic  Resolution  of  Structural  States 


The model proposed by Lymn and Taylor in 1971 laid the groundwork for understanding the
coupling of the ATP hydrolytic and mechanical cycles of actomyosin and placed the major (then
postulated) structural states of actomyosin in the first kinetic framework (Fig. 11.2) [45].
The cycle was established by transient kinetic studies monitoring the interaction of
F-actin, soluble myosin fragments (HMM and S1), and nucleotide. The cycle is usually
described starting from the strongly actin-bound, nucleotide-free myosin (rigor) state.
The rapid and high-affinity binding of ATP to the myosin head causes a drastic
weakening of the acto-S1 interaction and the dissociation of the two proteins, as evidenced by a
concomitant decrease in light scattering. The hydrolysis of ATP was found to occur
mainly in the actin-detached state. A burst in phosphate (Pi) liberation from ATP,
detected in quenched-flow transient kinetic experiments, indicated that ATP hydrolysis
precedes the rate-limiting step of the chemical cycle. Although not directly evidenced
by experiments at the time, ATP hydrolysis was proposed to be linked to the priming of
the myosin head. The myosin.ADP.Pi products complex was shown to rebind to actin.
The release of the hydrolysis products was thus proposed to be coupled to the
strengthening of the actomyosin interaction and the force-generating powerstroke step, during
which the crossbridge swings back to its rigor conformation.
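
The phosphate burst argument can be illustrated with a minimal two-step sketch. It assumes saturating ATP, treats hydrolysis as irreversible for simplicity, and uses hypothetical rate constants named k_hyd (hydrolysis) and k_rel (product release); neither the names nor the numerical values come from the studies cited above.

```python
import math

def total_pi_per_head(t, k_hyd, k_rel):
    """Total Pi (bound + free) formed per myosin head at time t (seconds).

    Hypothetical minimal scheme at saturating ATP:
        M.ATP --k_hyd--> M.ADP.Pi --k_rel--> M + ADP + Pi (ATP rebinds rapidly)
    Hydrolysis is treated as irreversible here purely for simplicity.
    """
    lam = k_hyd + k_rel                  # observed rate constant of the burst phase
    amplitude = (k_hyd / lam) ** 2       # burst amplitude in Pi per head
    k_ss = k_hyd * k_rel / lam           # steady-state turnover rate (s^-1)
    return amplitude * (1.0 - math.exp(-lam * t)) + k_ss * t

# Illustrative values only: fast hydrolysis followed by slow product release.
k_hyd, k_rel = 100.0, 0.05               # s^-1
for t in (0.0, 0.01, 0.05, 0.5):
    print(f"t = {t:5.2f} s   Pi per head = {total_pi_per_head(t, k_hyd, k_rel):.3f}")
```

When k_hyd is much larger than k_rel, the burst amplitude approaches one Pi per head, which is the signature of a chemical step that precedes the rate-limiting step. If product release were faster than hydrolysis, the amplitude would be close to zero and no burst would be seen.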

A more detailed kinetic framework of the S1 ATPase cycle and the conformational
changes occurring in the absence of actin was proposed some years later by
Bagshaw and Trentham [46, 47]. In their detailed transient kinetic analysis they
utilized the increase in the intrinsic tryptophan (Trp) fluorescence of rabbit skeletal
muscle S1 occurring upon nucleotide binding and ATP hydrolysis. Their work
revealed that the binding process occurs in two steps: a rapid formation of a weakly
bound myosin-nucleotide collision complex is followed by a conformational
transition leading to the strengthening of the complex, associated with an increase in Trp
fluorescence. ATP hydrolysis was associated with a further fluorescence
enhancement, indicating the formation of the primed (up-lever) crossbridge state.
Quenched-flow experiments as well as isotope exchange studies (i.e., the incorporation of
multiple 18O atoms from water into the liberated Pi) revealed the reversibility of the
hydrolysis step [48]. The quasi-irreversible release of Pi was proposed to be the
rate-limiting step in the absence of actin. ADP release was shown to be essentially the
reversal of the two-step ATP binding process.

Fig. 11.2 Mechanochemical cycle of the actomyosin motor. The mechanism shown incorporates
the Lymn-Taylor model in conjunction with major conceptual advances gained from recent
structural and kinetic investigations. Myosin and actin are shown in green and blue, respectively. During
the working cycle, ATP binding to the myosin head dissociates the strongly bound actomyosin
rigor complex (lowest panel). During the recovery step (left two panels), which occurs in
ATP-bound myosin, the myosin lever moves from a "down" to an "up" (primed) position. Note the
change in lever orientation relative to the motor domain. In the actin-detached states (upper left
four panels), the myosin head (motor domain plus lever) is shown in several different orientations
(connected by double-sided arrows) to indicate its free rotation about a flexible joint that connects
it to the distal part of the molecule (and the thick filament in muscle). The hydrolysis of ATP to
ADP and Pi (inorganic phosphate) occurs only in the up-lever state (upper left panel). The
post-hydrolytic up-lever complex (upper middle panel) can continue the cycle in two pathways. If the
lever swings back to a down position when the head is detached from actin (futile lever swing,
middle panels), the ATP hydrolysis cycle is completed without work production. In order to
undergo an effective powerstroke leading to force generation, the head must rebind to actin (upper
right panel) before the lever swing (up-to-down movement, right two panels). Following the lever
swing and the release of hydrolysis products (Pi and ADP), the myosin head binds a new ATP
molecule. Note that this scheme does not indicate changes in actin affinity or motor domain
structure, and thus it does not discriminate between alternative pathways of the powerstroke discussed
in the text. Reproduced with permission from ref. [58]
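
The two-step nucleotide binding mechanism of Bagshaw and Trentham predicts a hyperbolic dependence of the observed rate constant on ATP concentration. The sketch below shows this under a rapid-equilibrium assumption for the collision complex; the function name, parameter names (K1, k2, k_minus2), and numerical values are hypothetical and chosen only for illustration.

```python
def k_obs_two_step(atp, K1, k2, k_minus2=0.0):
    """Observed rate constant (s^-1) for a two-step binding scheme:

        M + ATP <=K1=> M.ATP (collision complex) --k2--> M*.ATP

    atp is the ATP concentration in M, K1 an association constant in M^-1
    (collision complex assumed to be in rapid equilibrium), k2 the forward
    rate constant of the conformational transition, k_minus2 its reverse.
    """
    return K1 * atp * k2 / (1.0 + K1 * atp) + k_minus2

# Illustrative values only.
K1, k2 = 1.0e4, 400.0                    # M^-1 and s^-1
for atp_uM in (1, 10, 100, 1000, 10000):
    rate = k_obs_two_step(atp_uM * 1e-6, K1, k2)
    print(f"[ATP] = {atp_uM:6d} uM   k_obs = {rate:7.2f} s^-1")
```

At low ATP the observed rate constant rises linearly with an apparent second-order rate constant K1*k2, and at high ATP it saturates at k2, the rate constant of the conformational transition. A one-step binding process would instead show no saturation.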


Almost three decades later, a series of kinetic and spectroscopic studies from
various laboratories provided an extension of the Bagshaw-Trentham and
Lymn-Taylor models by the determination of the precise correspondence between
structural and kinetic states. Advances in molecular genetic techniques, recombinant
protein expression, and mutagenesis allowed the precise placement of spectroscopic
probes into specific locations of the myosin head [49-55]. The major source of Trp
fluorescence changes upon nucleotide binding and ATP hydrolysis was assigned to
conserved Trp residues located at the nucleotide binding site and the so-called relay
loop, respectively. The relay loop is located at the interface of the L50 and converter
subdomains (Fig. 11.1c, d). Trp fluorescence experiments showed that lever priming
and ATP hydrolysis are distinct but coupled steps, as hydrolysis can take place only
in the primed (up-lever) myosin state [54, 56]. These findings were in line with
structural results showing that the catalytic residues of the active site are in place for
catalysis only in the up-lever conformation [40, 42].

Trp signals originating from the relay loop allowed the experimental verification
of the earlier concept that the up-to-down lever movement occurring after
hydrolysis in the myosin.ADP.Pi complex is a distinct step preceding Pi release [47, 57]. In
addition, it was shown that, in the absence of actin, the rate-limiting step of the
enzymatic cycle is the up-to-down lever movement and not the actual release of Pi.
The up-to-down lever movement is markedly accelerated by actin, which was
pointed out as the key phenomenon to ensure that lever priming (down-to-up
movement) and the force-generating powerstroke (up-to-down movement) occur in
actin-detached and attached states, respectively. This feature leads to a kinetic pathway
selection mechanism enabling efficient force generation, and explains why the
powerstroke can start from the myosin.ADP.Pi state, which has the lowest actin affinity
among all intermediates of the enzymatic cycle (Fig. 11.2) [58-60].
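
Kinetic pathway selection can be illustrated as a simple competition between two routes out of the post-hydrolytic up-lever state. The sketch below is a deliberately reduced picture: it assumes a slow, actin-independent futile lever swing (k_futile) competing with an actin-activated productive route whose effective rate rises hyperbolically with actin concentration (k_max, K_actin_uM). All names and values are hypothetical and serve only to show the principle.

```python
def productive_fraction(actin_uM, k_futile, k_max, K_actin_uM):
    """Fraction of post-hydrolysis heads that swing the lever while on actin.

    Two competing first-order exits from the up-lever myosin.ADP.Pi state:
      futile lever swing off actin:   k_futile (s^-1), actin-independent
      actin-activated lever swing:    k_max * [A] / (K_actin + [A])
    The hyperbolic actin dependence is an assumption of this sketch.
    """
    k_productive = k_max * actin_uM / (K_actin_uM + actin_uM)
    return k_productive / (k_productive + k_futile)

# Illustrative values only.
for actin in (0.0, 1.0, 10.0, 100.0):
    frac = productive_fraction(actin, k_futile=0.05, k_max=20.0, K_actin_uM=50.0)
    print(f"[actin] = {actin:6.1f} uM   productive fraction = {frac:.3f}")
```

Because the futile swing is slow, even a weakly binding intermediate is channeled almost entirely through the actin-attached route once actin accelerates the lever movement, which is the essence of the selection mechanism described above.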

Parallel advances in structural and kinetic-spectroscopic investigations also
revealed many of the fine details of the allosteric linkage between the three most
important functional parts of the myosin head: the actin and nucleotide binding sites
and the lever. The ATPase active site contains three conserved loops termed the
P-loop, switch-1, and switch-2. In the absence of actin, switch-2 and the lever show
coupled movement. An open (non-catalytic) switch-2 conformation is associated
with a down lever. Closure of switch-2, and thus the acquisition of catalytic
competence, was found to coincide with lever priming (down-to-up movement). Kinetic
studies revealed that lever priming is a rapid and reversible step, which is a
prerequisite for the subsequent, relatively slower chemical step of ATP hydrolysis [54, 56].

The  concept  of  weak  and  strong  actin  binding  states  of  myosin  emerged  from 
kinetic  and  spectroscopic  studies.  Besides  light  scattering,  the  most  useful  signal 
used  in  these  studies  was  site-specific  labeling  of  actin  by  cysteine-reactive  pyrene 
dyes  at  residue  C374  close  to  the  actin  C-terminus  [61].  Quenching  of  pyrene  fluores- 
cence occurs  upon  the  formation  of  the  strongly  bound  actomyosin  complex  [62,  63]. 

The actin binding region of the myosin head involves several structural elements
contributed from both sides of the large cleft separating the U50 and L50
subdomains [64]. The incomplete fitting of the available atomic structures of the myosin
head and actin into cryo-EM envelopes of the actomyosin rigor complex implied
that the cleft must undergo closure in order to adopt a strongly actin-bound state
[65]. Structural studies also revealed that a class 5 myosin is able to adopt a
closed-cleft structure even in the absence of actin, which can be well fitted into cryo-EM
maps of the rigor complex [66, 67]. Correspondingly, it was found that this myosin
isoform binds actin in a rapid, quasi-diffusion-controlled manner [68]. The kinetics
of the pyrene fluorescence quench occurring upon actin binding by myosin 5 did not
show a saturating tendency with increasing actin concentration. Later it was found
that muscle myosin 2 isoforms from various molluscan species are also able to
adopt a rigor-like conformation in the absence of actin, and bind to actin rapidly in
a manner similar to that observed in myosin 5 [69].

Coupling between the conformational changes in myosin's actin and nucleotide
binding regions leads to an antagonistic relationship (negative thermodynamic
coupling) between actin and nucleotide binding affinities. This coupling was
characterized in detail and kinetically resolved by using a variety of intrinsic and extrinsic
fluorescent and EPR probes [51, 70-72]. In addition, it was discovered that the
conformation of the switch-1 loop of the active site is linked to changes in actin affinity,
probably via cleft movement [73, 74]. An open switch-1 state is associated with low
nucleotide affinity and a closed cleft conferring high actin affinity. Switch-1 closure
upon nucleotide binding leads to cleft opening and weakening of myosin's actin
affinity. The results also revealed the role of the γ-phosphate of ATP in switch closure, as
the binding of ADP to actomyosin induces limited switch-1 closure.
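
The negative thermodynamic coupling described above follows from detailed balance around the four-state binding cycle of myosin (M), actin (A), and nucleotide (N). The symbols below are introduced here for illustration only: K_A and K_AN are the dissociation constants of actin from M and from M.N, and K_N and K_NA are the dissociation constants of nucleotide from M and from A.M.

```latex
% Both routes from M to the ternary complex A.M.N must give the same
% overall equilibrium (thermodynamic box):
\begin{align}
  K_{A}\,K_{NA} &= K_{N}\,K_{AN} \\
  c \;\equiv\; \frac{K_{AN}}{K_{A}} &= \frac{K_{NA}}{K_{N}}
\end{align}
% c > 1: negative coupling (each ligand weakens the binding of the other)
% c = 1: independent binding
```

A single coupling factor c therefore describes both effects: if bound nucleotide weakens actin binding by a factor c, bound actin must weaken nucleotide binding by the same factor.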

As discussed above, the processes of nucleotide-induced actomyosin
dissociation and the subsequent conformational changes of actin-detached myosin
constitute fairly well-described segments of the mechanochemical cycle. However, the
functionally most intriguing part of the mechanism, i.e., force generation via the
powerstroke, has remained elusive, mainly due to the low abundance and kinetically
inaccessible nature of its key intermediates. A comprehensive model describing
myosin's structural transitions leading to the powerstroke was set forth by Geeves
and Holmes [64]. As a cardinal feature, the model proposes that the actin-attached
powerstroke (up-to-down lever movement) occurs via a structural pathway that is
markedly different from that of the well-characterized actin-detached lever priming
process (down-to-up movement). In the actin-attached pathway, the lever swing
starts from a closed-cleft, high actin-affinity "top-of-powerstroke" state that has a
closed active site (switches 1 and 2) and the lever arm in the "up" orientation. The
lever swing then occurs without switch-2 opening (in contrast to the situation in the
absence of actin), but switch-1 opens to release the hydrolysis products. This
proposition challenges the long-standing "back door" hypothesis, which stated that,
following ATP hydrolysis, Pi is released via a route different from that of ATP entry [75].
The model of Geeves and Holmes also embraced the idea that the strengthening of
the actomyosin interaction is associated with the movement of the P-loop and the
twisting of the central β-sheet of the MD. This β-sheet, termed the "transducer"
together with associated elements, has been proposed to act as an energy-storing
torsional spring within the MD (Fig. 11.1c, d) [68, 76].

Force production is generally conceived to occur in strongly bound actomyosin
complexes. In line with this, the above model implies that the strengthening of the
actomyosin interaction precedes the actual swing of the lever. However, the limited
available evidence does not exclude that the force generation process could occur
via parallel pathways, including ones in which the lever swing occurs in weakly
actin-attached myosin. The subsequent strengthening of the actomyosin interaction
could thermodynamically stabilize the post-powerstroke state [58]. The force
dependence of the transition between weakly and strongly actin-bound myosin
complexes can thus markedly influence the fluxes conveyed by the alternative
pathways. This constitutes one of the key unresolved questions regarding force
generation (see also the next section).

The physical time scale of the kinetic transitions occurring during the working
cycle (milliseconds to seconds) is several orders of magnitude longer than the time scales
accessible to computational simulation methods that could describe structural
pathways, energy landscapes, and transition states. However, significant advances in
these approaches, including conjugate peak refinement (CPR), normal mode
analysis (NMA), umbrella sampling methods, and molecular dynamics simulations, have enabled
the description of structural trajectories of nucleotide-induced cleft opening, lever
priming, and ATP hydrolysis [77-80].


11.4    Force  Generation  and  Motility 

Mechanoenzymes have the specific feature that the enzymatic reaction is linked to
mechanical action, which can be followed and analyzed by force manipulation and
particle tracking techniques. Muscle produces a macroscopic mechanical response,
which was investigated for many decades on intact muscles and muscle fibers.
These techniques generally measure the mechanical response of the specimen to the
addition of chemical agents (by applying changes in solution conditions) or to a
mechanical stimulus (by applying rapid stretch or release). Demembranated
(skinned) fibers with an exposed contractile apparatus, produced first by
Szent-Gyorgyi, allow the application of rapid changes in chemical conditions. Mechanical
manipulation (length-jump) experiments have provided important insights into the
kinetics and energetics of muscle contraction [81]. A crucial aim of these studies is
to establish the linkage between macroscopic parameters of force production and
the underlying molecular mechanisms. For instance, the linkage of the
force-generating step to enzymatic product release steps has been a matter of great controversy.
Based on the dependence of the mechanical response of muscle fibers on solute
(mainly Pi) concentrations, the majority of groups argue that force generation occurs
before Pi release [82-85]. However, other workers propose that the force-generating
step is a conformational transition between two ADP-bound states, and thus it
occurs after Pi has been released from the myosin head [86, 87].

It was discovered as early as 1923 by Fenn that the heat output of muscle
increases when it is allowed to shorten against a load, as compared to the situation
when it only holds tension (isometric contraction) [88]. Accordingly, force-velocity
relationships of contracting muscles and fibers define an optimum for power output,
generally around 1/3 of the maximal (unloaded) shortening velocity [1]. Based on
current knowledge of the molecular mechanisms, the Fenn effect is thought to arise
from the load dependence of the kinetics of multiple key steps of the actomyosin
ATPase cycle. As a result, lower loads will be associated with more rapid ATPase
activity and shorter periods of actin attachment, whereas an increase in resistive
load will prolong the actin attachment lifetime and slow down the enzymatic cycle of
myosin heads [5, 7, 89-91]. Fiber techniques have provided estimates of 1-10 pN for
the unitary isometric force produced by a single myosin head and values around 7 nm
for the unitary displacement occurring during a single powerstroke [92, 93]. These
values were refined by single molecule techniques more directly measuring unitary
displacement and force generation (see below).
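
The power optimum near 1/3 of the unloaded shortening velocity can be illustrated
with a short numerical sketch. It assumes Hill's hyperbolic force-velocity relation
in normalized form, which is not introduced in the text above, and a curvature
parameter k = a/F0 of 0.25 chosen only as a typical illustrative value. The function
names are hypothetical.

```python
import math


def hill_force(u, k=0.25):
    """Relative force F/F0 at relative shortening velocity u = v/vmax.

    Normalized Hill hyperbola (an assumption of this sketch).
    k = a/F0 is an assumed curvature parameter, not a value from the text.
    """
    return k * (1.0 - u) / (u + k)


def power(u, k=0.25):
    """Relative mechanical power: (F/F0) * (v/vmax)."""
    return hill_force(u, k) * u


def optimal_velocity(k=0.25):
    """Velocity fraction that maximizes power.

    Setting d(power)/du = 0 gives u^2 + 2*k*u - k = 0,
    whose positive root is returned here.
    """
    return -k + math.sqrt(k * k + k)


u_opt = optimal_velocity()
# Cross-check the closed form against a brute-force scan of the power curve.
u_scan = max((i / 10000 for i in range(10001)), key=power)
print(f"power optimum at v/vmax = {u_opt:.3f} (grid scan: {u_scan:.3f})")
```

With this assumed curvature the optimum falls at roughly 0.3 of the unloaded
velocity; other values of k shift it somewhat, which is consistent with the
"generally around 1/3" wording.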

The swinging lever hypothesis implies that force generation requires the movement
of a lever whose length will directly determine the unitary displacement. In
vitro motility assays were the first to allow quantitative measurement of motile
properties of experimental systems assembled from isolated protein components
[94-97]. The most common form of these assays images and measures the velocity
of the gliding of fluorescently labeled actin filaments over a myosin-coated surface,
monitored by fluorescence microscopy. In vitro motility assays also allowed the
mechanical properties of non-muscle myosins to be studied, for which macroscopic
forces would be difficult to assess in vivo.

The  combination  of  in  vitro  motility  assays  with  the  genetic  manipulation  of 
lever  length  provided  the  first  solid  support  for  the  lever  arm  hypothesis  [98,  99]. 
Myosin  constructs  of  varying  lever  length  could  be  produced  either  by  modification 
of  the  number  of  light  chain  binding  regions  in  the  neck  region  or  by  appending 
artificial  levers  to  the  MD.  The  in  vitro  actin  gliding  velocity  was  found  to  increase 
in  proportion  with  lever  length.  Further  support  for  the  lever  theory  came  from 
experiments in which the orientation of the lever was redesigned by genetic
manipulation. Myosin constructs with a reverse-oriented lever exhibited a reversal in the
direction  of  the  movement  of  actin  filaments  in  the  in  vitro  motility  assay  [100]. 

A  further  great  technical  breakthrough  in  the  investigation  of  molecular  motility 
was  the  application  of  the  optical  trap,  which  allowed  direct  measurement  of  forces 
produced  by  single  molecules  [101-104].  In  this  technique,  a  focused  laser  beam  is 
used to hold a micrometer-size bead in position, which allows the measurement of
molecular  forces  pulling  the  bead  out  of  position.  A  widely  used  arrangement  for 
the  measurement  of  actin-myosin  interactions  is  a  three-bead  dumbbell  assay  in 
which an actin filament is suspended between two beads, and a single myosin molecule
is attached to a third bead serving as a pedestal. This arrangement also allows
the  application  of  rapid  external  force  feedback  to  determine  the  effect  of  external 
load on the mechanical response of single myosin molecules. Besides the measurement
of the main force-producing powerstroke step [105], these assays were also
useful  in  determining  the  load-dependent  kinetics  of  ADP  release  of  various  myosin 
isoforms  [7,  89-91,  106].  This  feature  turned  out  to  be  an  important  mechanism 
regulating  the  actin  attachment  lifetime  of  myosin  heads,  which  determines  their 
sliding  velocity  and  tension  bearing  properties. 
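
How a resisting load could lengthen the actin attachment lifetime can be sketched
with a Bell-type exponential dependence of the ADP release rate on force. This
functional form is an assumption of the sketch, as are the unloaded rate (k0) and
the distance parameter; none of them are values taken from the studies cited above.

```python
import math

KBT_PN_NM = 4.1  # thermal energy near room temperature, in pN*nm


def adp_release_rate(load_pn, k0=100.0, distance_nm=1.0):
    """Bell-type ADP release rate (1/s) under a resisting load (pN).

    k0 (unloaded rate) and distance_nm (distance parameter) are
    hypothetical illustration values, not measurements.
    """
    return k0 * math.exp(-load_pn * distance_nm / KBT_PN_NM)


def attached_lifetime(load_pn, k0=100.0, distance_nm=1.0):
    """Mean attachment time (s), assuming ADP release limits detachment."""
    return 1.0 / adp_release_rate(load_pn, k0, distance_nm)


for load in (0.0, 2.0, 4.0):
    rate = adp_release_rate(load)
    lifetime_ms = attached_lifetime(load) * 1e3
    print(f"load {load:.0f} pN: rate {rate:.1f} 1/s, lifetime {lifetime_ms:.1f} ms")
```

In this picture a larger distance parameter makes an isoform more load sensitive,
which is one way to think about the isoform differences mentioned above.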

Advances in the visualization of single molecules by fluorescence microscopic
techniques have also been valuable in revealing motile mechanisms. To observe
single molecules, one of the most widely applied techniques is total internal
reflection fluorescence (TIRF) microscopy. This technique allows selective excitation of
fluorophores  in  the  vicinity  of  the  microscope  slide  via  an  evanescent  field,  thereby 
eliminating  background  fluorescence  originating  from  the  bulk  of  the  sample. 
Single  enzymatic  cycles  were  observed  by  following  the  interaction  of  fluorescently 
labeled  nucleotides  with  surface-attached  single  myosin  molecules  [107,  108]. 
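
The thinness of the evanescent excitation layer can be estimated with the standard
expression for the penetration depth of an evanescent wave,
d = wavelength / (4*pi*sqrt(n1^2 sin^2(theta) - n2^2)). The sketch below is only
illustrative: the wavelength, incidence angle, and refractive indices are assumed
typical values for a glass/water interface, not parameters from the cited
experiments.

```python
import math


def critical_angle_deg(n_glass, n_sample):
    """Smallest incidence angle (degrees) giving total internal reflection."""
    return math.degrees(math.asin(n_sample / n_glass))


def penetration_depth_nm(wavelength_nm, theta_deg, n_glass, n_sample):
    """1/e decay depth (nm) of the evanescent field intensity."""
    s = (n_glass * math.sin(math.radians(theta_deg))) ** 2 - n_sample ** 2
    if s <= 0:
        raise ValueError("angle is at or below the critical angle")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(s))


def relative_intensity(z_nm, depth_nm):
    """Excitation intensity at height z above the surface, relative to z = 0."""
    return math.exp(-z_nm / depth_nm)


# Assumed example values: 488 nm laser, glass (n = 1.518) against water (n = 1.33).
theta_c = critical_angle_deg(1.518, 1.33)
depth = penetration_depth_nm(488.0, 70.0, 1.518, 1.33)
print(f"critical angle {theta_c:.1f} deg, penetration depth {depth:.0f} nm")
print(f"relative intensity 500 nm into the sample: {relative_intensity(500.0, depth):.4f}")
```

Under these assumptions the excitation decays within roughly a hundred nanometers
of the slide, which is why fluorophores in the bulk solution contribute little
background.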

For  the  investigation  of  non-muscle  myosin  motility,  the  most  widely  applied 
TIRF  arrangement  involves  the  attachment  of  actin  filaments  to  the  slide  surface  via 
biotin-avidin  bridges  [109].  This  arrangement  allows  monitoring  of  the  movement 
of fluorescently labeled single myosin molecules on the actin tracks if they are
processive, i.e., able to perform a series of steps along the actin filament before
detachment. Movement observed in this assay is generally considered direct evidence
of  the  processive  nature  of  a  given  type  of  motor  protein. 

A further refinement of the TIRF assay is FIONA (fluorescence imaging with
1 nm accuracy), which allows precise determination of the position of a single
fluorophore by applying two-dimensional Gaussian fits to the spatial distribution of
fluorescence emission intensity [110]. Sizes and durations of individual steps during
processive runs of various myosin isoforms have been resolved by FIONA [111-114].
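
Why a diffraction-limited spot a few hundred nanometers wide can be localized to
about a nanometer can be illustrated with a deliberately idealized simulation. It
assumes a Gaussian spot with no background, no camera pixelation, and a
hypothetical spot width and photon count. Under those assumptions the best-fit
center of the Gaussian is simply the mean photon position, and its uncertainty
scales as the spot width divided by the square root of the number of photons. Real
FIONA analysis fits a two-dimensional Gaussian to pixelated camera images instead.

```python
import math
import random


def expected_precision_nm(psf_sigma_nm, n_photons):
    """Photon-noise-limited localization precision per axis (idealized)."""
    return psf_sigma_nm / math.sqrt(n_photons)


def localize(photons):
    """Estimate the spot center as the mean photon position.

    For an ideal Gaussian spot of known width this mean is the
    maximum-likelihood center, i.e. the outcome of a perfect Gaussian fit.
    """
    n = len(photons)
    return (sum(x for x, _ in photons) / n, sum(y for _, y in photons) / n)


rng = random.Random(1)        # fixed seed so the sketch is reproducible
true_x, true_y = 37.0, -12.0  # hypothetical fluorophore position (nm)
psf_sigma = 125.0             # assumed spot standard deviation (nm)
n_photons = 10000             # assumed number of collected photons

photons = [(rng.gauss(true_x, psf_sigma), rng.gauss(true_y, psf_sigma))
           for _ in range(n_photons)]
est_x, est_y = localize(photons)
error = math.hypot(est_x - true_x, est_y - true_y)
print(f"expected precision per axis: {expected_precision_nm(psf_sigma, n_photons):.2f} nm")
print(f"localization error in this run: {error:.2f} nm")
```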

Synthesis of the knowledge gained in biochemical and molecular mechanical
investigations may provide clues regarding an important theoretical aspect of motor
protein action, i.e., whether force generation is based on lever strain or a Brownian
ratchet-like mechanism [115]. In the lever strain model, one ATPase cycle leads to
strictly one lever swing event and coupled translocation. In contrast, a Brownian
ratchet-like motor can freely fluctuate between pre- and post-powerstroke states.
This fluctuation is rectified by the nucleotide hydrolytic cycle, which imposes a bias
as different lever conformations may be preferred for track binding and dissociation.
Ratchet mechanisms can thus act at lower coupling between the chemical
cycles and translocation. The current thinking on actomyosin action embraces both
lever-type and ratchet-type concepts. This conceptual framework has also been
applied to other systems including DNA-based motor proteins.


11.5    Correspondence  Between  Molecular  Properties 
and  Physiological  Functions 

The first so-called unconventional myosin (myosin 1) was discovered in 1973
[116]. This myosin does not contain sequences for dimerization and thus adopts a
single-headed, nonfilamentous structure. The large amount of sequence information
gained in the subsequent decades revealed the existence of a multitude of
myosin isoforms whose subunit composition and domain structure showed wide
variations in line with their supramolecular assemblies, regulatory mechanisms,
and physiological functions [29, 117]. The activity of myosins is required for a
variety of life processes including muscle contraction, cell migration, division,
sensory functions, membrane trafficking, and formation of cellular protrusions
such as brush borders and filopodia [117]. The myosin superfamily is currently
classified into 35 classes [118]. The MD has proved to be the most conserved part
of myosin in terms of its sequence and structure, and therefore this domain was
used as a basis for classification. Besides the MD, most myosins contain neck and
tail domains. The neck is usually centered along an elongated α-helix of the heavy
chain containing varying numbers of conserved sequence elements called IQ
motifs, each of which can bind calmodulin or a calmodulin-family light chain. It
was also proposed that myosin necks may be extended by stable charged single
α-helices with no associated light chains [119-121]. The tails of myosins from
different classes contain various effector and partner binding domains and, in some
classes, induce heavy chain dimerization.

The size and stability of filamentous assemblies vary widely between different
myosin 2 isoforms [1, 29]. Sarcomeric (skeletal and cardiac) muscle myosin 2
isoenzymes form large and stable thick filaments. Regulation of the action of these
myosins by Ca2+ is generally mediated by actin-associated protein complexes
(mainly troponin and tropomyosin) by occluding or exposing the myosin binding
sites on the actin filament [13, 122]. In smooth muscle and non-muscle myosin 2,
filament assembly and myosin activity are dynamically regulated via myosin
phosphorylation [123]. In the off-state with unphosphorylated RLC, the two heads of
these  myosins  adopt  an  asymmetrical  arrangement  in  which  the  actin  binding  site  of 
one  head  interacts  with  the  nucleotide  binding  site  of  the  other  [124].  Other  myosin 
2  isoforms,  including  those  from  molluscan  muscle,  are  regulated  by  direct  binding 
of  Ca2+  ions  to  the  ELC  [125].  Ca2+  binding  also  regulates  the  activity  of  many 
unconventional  myosins  [117]. 

Heavy chain dimerization is a prerequisite for the processive walking of
cytoskeletal transporters acting as single holoenzymes. Dimerization of myosin 5 was
obvious, but that of other transporters such as myosins 6 and 10 has been a
contentious issue [119, 126, 127]. In myosin 6, the importance of a cargo-induced
dimerization mechanism has been pointed out [126]. Myosin 6 displays other peculiar
adaptations including reverse directionality (i.e., movement towards the minus end
of actin tracks) dictated by a class-specific insert at the base of its lever, which also
has a reversibly extendable domain [128, 129]. Membrane attachment can also
regulate the targeting and supramolecular organization of various unconventional
myosins [117]. Another intriguing adaptation is that other motor proteins or actin
bundles can induce processive motility of some myosin isoforms that were
previously shown to be non-processive on bare actin filaments [130, 131].

Assemblies of multiple motor units into dimers or filaments raise the possibility
of large-scale allosteric communication between motor units via exerting forces on
each other. It has long been known from muscle physiological studies that rapid
pulling or release of activated muscle fibers can produce partial synchronization of
myosin heads, implying a load-dependent mechanism. Biochemical and single
molecule experiments have revealed the importance of such mechanisms in most
filamentous myosins [117]. In the case of double-headed transporters, this form of
communication may keep the mechanochemical cycles of the two heads out of
phase. This mechanism enables efficient processive movement by preventing
simultaneous actin detachment of both heads [89]. In myosin 5 performing single
molecule motility, the length of the neck also determines the optimal arrangement of
myosin heads to reach the next subunit on the actin track [109]. Recent advances in
time-resolved atomic force microscopy (AFM) have enabled the direct observation
of the processive walking of myosin 5 molecules along actin filaments [132].

The general mechanokinetic framework set forth by Lymn and Taylor has proven
to be applicable to most myosins, but there is a large variation between isoforms in
the magnitude and ratio of the individual rate and equilibrium constants of
transitions [133]. Simplistically, the mechanochemical cycle time can be divided into
the lifetimes of actin-attached (t_on) and actin-detached (t_off) states, governed by
the steps of the ATPase cycle, which are linked to changes in actin interaction. The
important concept of duty ratio (or duty cycle, r) can be defined by the fraction of
cycle time spent in actin-bound states: r will thus equal t_on/(t_on + t_off). In the
ergodic approximation, this will also equal the fraction of motor units (heads) bound
to actin in a population at a given time point during steady-state cycling.
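
The definition of the duty ratio and its ergodic interpretation can be checked
numerically. The sketch below assumes, purely for illustration, that the attached
and detached lifetimes are exponentially distributed with hypothetical means of
2 ms and 98 ms. It compares the defining ratio with the fraction of time a single
simulated head spends attached over many cycles, which in the ergodic picture
equals the fraction of heads attached at any instant.

```python
import random


def duty_ratio(t_on, t_off):
    """Fraction of the cycle time spent attached to actin."""
    return t_on / (t_on + t_off)


def simulated_attached_fraction(t_on, t_off, cycles=20000, seed=0):
    """Time-averaged attached fraction of one head over many cycles.

    Dwell times are drawn from exponential distributions with means
    t_on and t_off (an assumption of this sketch).
    """
    rng = random.Random(seed)
    on_total = 0.0
    off_total = 0.0
    for _ in range(cycles):
        on_total += rng.expovariate(1.0 / t_on)
        off_total += rng.expovariate(1.0 / t_off)
    return on_total / (on_total + off_total)


r = duty_ratio(2.0, 98.0)  # hypothetical lifetimes in ms
r_sim = simulated_attached_fraction(2.0, 98.0)
print(f"duty ratio from lifetimes: {r:.3f}; simulated time average: {r_sim:.3f}")
```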

It  is  characteristic  of  most  myosins  that  ATP  binding  causes  rapid  dissociation 
from  actin,  and  hydrolysis  occurs  in  the  actin-detached  state.  Following  this,  Pi  release 
and  ADP  release  are  accelerated  to  various  extents  by  actin.  As  the  myosin.ADP.Pi 
and  myosin.ADP  states  bind  to  actin  weakly  and  strongly,  respectively,  the  rate  of  Pi 
and ADP release will define various duty ratios ranging from 1 to 2 % in rapidly
contracting fast skeletal muscle to more than 70 % in myosin 5. The physiological
importance of the adaptable duty ratio is that (1) it must be sufficiently high to maintain
continuous  actin  attachment  of  a  supramolecular  motor  ensemble  in  order  to  produce 
prolonged  translocation  and  (2)  it  must  be  sufficiently  low  that  the  individual  motor 
units  do  not  pose  a  drag  force  opposing  the  contraction  driven  by  other  motors. 
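
The first requirement can be made concrete with a simple probability estimate.
Assuming, as a simplification not made in the text, that heads cycle independently
of one another, the probability that at least one of N heads is attached at a given
instant is 1 - (1 - r)^N. The 99 % target used below is an arbitrary illustrative
threshold.

```python
import math


def prob_any_attached(r, n_heads):
    """Probability that at least one of n independent heads is attached."""
    return 1.0 - (1.0 - r) ** n_heads


def min_heads(r, target=0.99):
    """Smallest number of independent heads reaching the target probability."""
    if not 0.0 < r < 1.0:
        raise ValueError("duty ratio must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - r))


for label, r in (("low duty ratio (0.02)", 0.02), ("high duty ratio (0.7)", 0.7)):
    print(f"{label}: {min_heads(r)} heads needed for 99 % attachment")
```

Under these assumptions a motor with a duty ratio of a few percent needs an
ensemble of a couple of hundred heads, whereas a high duty ratio motor needs only a
handful, in line with the contrast between filament-forming muscle myosins and
transporters working in small numbers.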

A  key  kinetic  and  thermodynamic  determinant  of  the  duty  ratio  is  ADP-actin 
coupling,  i.e.,  the  allosteric  effect  of  binding  of  actin  and  ADP  to  the  same  myosin 
head.  If  the  coupling  is  highly  negative,  as  in  rapidly  contracting  muscle  myosins, 
ADP  release  from  (and  subsequent  ATP  binding  to)  actin-bound  myosin  heads  will 
occur rapidly and the heads will only spend a short time attached to actin. In
contrast, in load-bearing myosins, the actin-ADP coupling is low because ADP release
is  not  or  only  weakly  accelerated  by  actin,  which  results  in  longer  lifetimes  of  actin 
attachment.  For  instance,  this  occurs  in  the  case  of  smooth  muscle  and  non-muscle 
myosin  2  isoforms  [5,  91,  134].  The  structural  basis  of  load-dependent  ADP  release 
is  that  an  additional  lever  swing  occurs  upon  this  step,  first  discovered  by  Milligan, 
Sweeney,  and  colleagues  in  smooth  muscle  myosin  [135].  This  structural  feature 
varies greatly between isoforms, and is in correspondence with their
mechanochemical properties [136].

In  summary,  the  mechanokinetic  properties  of  myosin  motors  are  shaped  by 
physiological demands. In general these features appear evolutionarily far less
conserved than protein structure and sequence. These principles have proven generally
applicable  to  various  other  motor-track  systems  and  enzyme-nucleotide-effector 
ternary  complexes  [8-10]. 


Abstract This chapter illustrates the dynamic, evolving nature of molecular
biophysics by providing perspectives on future prospects in three major areas: X-ray
and neutron scattering, mass spectrometry, and therapeutic drug development. In all
three areas, major advances in the biological sciences, development of powerful
new experimental and computational tools, and urgent real-world challenges are
driving rapid progress. These developments have enabled and encouraged
biophysicists to focus increasingly on studying systems of various sizes and the
interactions between their components, rather than simply on their isolated
constituents. As the examples demonstrate, these interactions are often transient, and
may occur in massive macromolecular complexes, between macromolecules, or between
macromolecules and ligands. A diverse set of emerging and advancing technologies is
likely to spur future developments. These include advances in methods that enable
individual molecules to be studied at atomic resolution; high throughput methods;
increasing automation; development of massive databases that allow comparison
and analysis of data of many types gathered worldwide; and increasingly powerful
computational methods that enable ever larger systems to be modeled at high
resolution. Finally, the emerging field of synthetic biology will create exciting
opportunities to create, explore, and manipulate the biophysics of novel systems.

Keywords Advances in computation • Database development • High throughput
automation • Macromolecular interactions • Mass spectrometry • Membrane
proteins • Single molecule methods • Structural biology • Therapeutic drug
development • X-ray and neutron scattering

Molecular  biophysics  is  a  dynamic,  evolving  area  of  science  that  continues  to 
undergo  rapid  change  in  terms  of  the  kinds  of  questions  that  can  be  asked,  and  the 
experimental and computational tools that are available to address them. This
chapter presents perspectives on current challenges and future prospects in three major
areas  that,  in  combination,  provide  a  snapshot  of  where  the  field  is  now  and  where 
it  is  moving.  The  first  section,  on  X-ray  and  neutron  scattering,  emphasizes  the 
importance  of  biological  questions  in  driving  advances  in  these  technologies.  Many 
of these biological questions focus on interactions, often transient, in massive
macromolecular complexes, between macromolecules, or between macromolecules and
their  ligands.  These  themes  are  amplified  in  the  second  section,  which  describes  the 
explosive  development  of  mass  spectrometry  as  a  powerful  tool  for  characterizing 
the  conformation  and  dynamics  of  membrane  proteins,  large  macromolecular 
assemblies,  highly  heterogeneous  proteins,  and  molecular  interactions  in  vivo. 
Although  the  biological  themes  in  the  two  sections  are  similar,  their  juxtaposition 
reveals  the  complementarity  of  these  two  major  experimental  approaches  and  the 
insights  that  they  provide.  The  last  section,  on  the  use  of  biophysical  methods  in 
therapeutic  protein  development,  illustrates  another  important  trend,  the  rapidly 
increasing importance of molecular biophysics in solving real-world problems.
These  applications  have  also  driven  development  of  the  technology,  in  this  case 
towards small volume, high throughput methods. In all of these examples, the
questions being asked have resulted in advances in the technology, which have led to
increased understanding, and consequently the ability to address even more
complicated questions. In this way, molecular biophysics and the fields to which it is
applied  constantly  interact  to  advance  together. 


12.1    X-Ray  and  Neutron  Scattering 

X-ray  diffraction  and  neutron  scattering  are  relatively  mature  methods  and  thus  their 
future  prospects  will  be  driven  primarily  by  the  biological  questions  that  must  be 
envisaged or answered. Scattering methods are also well advanced; however,
considerable technical development can be anticipated in the foreseeable future,
particularly in improved methods and facilities for data collection and advances in
sample preparation. Together, it can be anticipated that problems that appear almost
insurmountable at present will become routine. The most compelling change will be
increasing use of scattering methods by newcomers who have not previously used these methods,
as  a  result  of  more  widespread  understanding  of  the  fundamentals  and  consequent 
development  of  automated  structural  determination.  These  prospects  are  outlined  for 
crystallography,  fiber  diffraction,  and  small  angle  scattering,  with  challenges  that  lie 
at  the  forefront  of  scattering  and  diffraction  methods  described  first. 


12.1.1    Challenges  at  the  Frontier  of  Structural  Biology 

For the most part, these challenges are conceptually well established; many are
under active investigation, and great progress is expected in the near future.
Areas  where  there  are  major  opportunities  to  enhance  understanding  of  cellular 
function include the architecture of microtubule organizing centers, kinetochores,
nuclear pore complexes, multiprotein membrane complexes found at the
interfaces  between  cells,  and  spliceosomes.  Inherent  in  all  of  these  research 
areas  are  interactions  between  macromolecules  and  ligands.  Interactions 
between  the  components  of  a  cell  will  remain  the  focus  of  structural  biology  for 
many  years  to  come  and  represent  a  real  change  in  what  is  expected  from  a 
structural investigation. In the early days of X-ray crystallography, it was
sufficient to determine the structures of the components. Initially every structure of
a  protein  or  nucleic  acid  was  considered  a  major  advance  with  little  regard  to  the 
ligands  or  macromolecular  interactions  involved.  However,  every  protein, 
nucleic acid, oligosaccharide, and small molecule ligand interacts with something
else in the cellular system. Consequently, structure determination today
goes  hand  in  hand  with  biochemical  and  cellular  studies  that  examine  the 
hypotheses  that  arise  from  the  structures  themselves.  This  is  because  the  focus 
has  moved  away  from  methodological  development  back  towards  understanding 
biological  phenomena.  This  progress  has  been  accompanied  by  an  increase  in 
the  size  and  complexity  of  the  biological  systems  that  can  be  investigated. 

The  traditional  approach  in  macromolecular  structure  has  been  to  divide  the 
problem  into  the  smallest  pieces  that  yield  useful  information  and  are  amenable  to 
study  and  then  to  construct  a  conceptual  model  of  the  original  larger  entity  from  the 
pieces. As techniques for determining the structures of large complexes have
improved, the size of the structures that can be studied has steadily increased so that
less division is required. This trend is likely to continue. The challenge with
complexes such as the nuclear pore complex [1] or kinetochores [2] is to isolate stable
subassemblies that will crystallize in a form that yields useful structural
information. Great progress has been made, but larger complexes that will ultimately lead to
a  complete  model  are  still  required.  The  next  frontier  in  many  areas  will  be  to  define 
the  transitory  interactions  between  molecules. 

Most of the structures of complexes that have been determined thus far represent
stable complexes (dissociation constants in the low micromolar or nanomolar
range), but many interactions in biology are much weaker, transitory, or modulated
by posttranslational modification or small molecule ligands. This is
particularly true when interactions involve an ensemble of weaker interactions
such as those associated with the cytoskeleton. These studies will necessarily
interface with results from electron microscopy that can provide a big picture of a
macromolecular assembly. Enhanced use of molecular modeling will eventually
become a vital tool in these studies, because, even with large complexes,
interactions between a comparatively small number of side chains or functional groups
can profoundly influence the behavior of a biological system. Most interactions in
biology are dominated by hydrogen bonds and hydrophobic interactions. Hence
high resolution structures of components will continue to be essential, but these
will have to be incorporated into a larger model.
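
The practical difference between stable and weak complexes can be illustrated
with the fraction of a molecule that is bound at a given partner concentration. The
sketch assumes simple 1:1 binding with the partner in large excess, so that its free
concentration is approximately its total concentration. The concentrations and
dissociation constants are hypothetical examples.

```python
def fraction_bound(partner_conc, kd):
    """Fraction bound for 1:1 binding; both arguments in the same units.

    Assumes the binding partner is in large excess, so its free
    concentration is approximately its total concentration.
    """
    return partner_conc / (kd + partner_conc)


partner_um = 1.0  # assumed partner concentration, micromolar
for label, kd_um in (("10 nM", 0.01), ("1 uM", 1.0), ("100 uM", 100.0)):
    print(f"Kd = {label}: fraction bound = {fraction_bound(partner_um, kd_um):.3f}")
```

At an assumed partner concentration of 1 micromolar, a nanomolar complex is almost
fully formed while a complex with a dissociation constant of 100 micromolar is
populated only about 1 % of the time, which helps explain why weak, transitory
interactions are difficult to capture.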


12.1.2    Macromolecular  Crystallography:  X-Rays 

Conventional  structural  determination  will  almost  certainly  become  increasingly 
routine.  The  major  developments  in  this  area  will  be  dominated  by  robotic  protein 
preparation,  crystal  growth  and  handling,  automated  data  collection,  and  structural 
determination. This approach has been pioneered by the efforts in structural
genomics, but is rapidly becoming the standard mode of operation for data recorded with
synchrotron  radiation.  These  techniques  allow  non-expert  users  to  incorporate 
X-ray  structural  studies  in  their  research  protocols. 

The most challenging technical problems in X-ray crystallography lie with
massive macromolecular complexes, transitory interactions between molecules, and
problems that yield vanishingly small crystals. At the frontiers of difficult
structures, considerable advances are expected, driven by current developments in
detector technology (pixel array detectors) coupled with the ability to record high
quality data from exceedingly small crystals (1-5 µm). As a consequence, considerably
less material is needed for a complex structural study than was once deemed necessary
(micrograms rather than milligrams). Increasing emphasis will be placed on determining
the structures of large macromolecular complexes, recognizing that protein:protein and
protein:nucleic acid interactions dominate much of cell biology. The crystals of
most of these complexes will not diffract to high resolution and will thus require
new approaches to determining low resolution structures. Development of suitable
metrics for assessing the reliable information content of these structures will be
critical for the outside reader or user of these structural data.

Another  area  that  will  see  rapid  growth  is  the  crystallographic  study  of  integral 
membrane  proteins.  These  have  lagged  behind  soluble  cytosolic  and  extracellular 
proteins  because  they  are  difficult  to  prepare  and  crystallize.  Even  when  they  do 
crystallize, most crystals of membrane proteins do not diffract well. The
improvements in detector and crystallization technology are expected to have a profound
impact  in  this  area.  Structural  studies  of  membrane  proteins  will  also  be  strongly 
influenced  by  developments  in  low  resolution  structural  determination. 

Synchrotron radiation has revolutionized X-ray structural determination, but
even though high resolution data can now be recorded in a few minutes with pixel
array detectors, radiation damage is still a major problem. Recent developments in
free-electron lasers that deliver ultrashort flashes of radiation in the femtosecond
range have the potential to overcome this challenge [3]. With the use of ultrashort
pulses of X-rays, the data can be recorded before the crystal has a chance to
disintegrate or suffer radiation damage. This technology also creates an opportunity to
examine even smaller crystals (less than 1 µm) and will facilitate study of
macromolecules that are difficult to crystallize. This is a highly challenging approach
since it requires combining scattering data from millions of diffraction experiments;
however, improvements in automated data collection and sample handling are
expected to simplify this approach for important structural problems.


12.1.3    Small  Angle  Scattering:  X-Rays 

As  is  the  case  for  macromolecular  crystallography,  the  results  from  small  angle 
scattering  will  increasingly  be  utilized  by  investigators  who  are  not  experts  in  the 
field  and  thus  will  require  improvements  in  automation  and  validation  to  ensure 
high  quality  routine  data  collection  and  appropriate  interpretation  of  the  results  [4]. 
A large part of the effort to increase the use of small angle scattering will be
associated with continued development of algorithms needed for ab initio modeling of the
scattering data and interfacing the results with those derived from other biophysical
techniques such as NMR and crystallography. In-house facilities have shown
dramatic improvements in recent years and have significantly increased the number of
users; however, synchrotron radiation will continue to play a major role because of
the improved signal to noise and speed of data collection. A standard set of
validation tools and protocols for depositing and reporting the results from small angle
scattering  studies  will  be  needed  to  optimize  the  investment  in  this  technique. 


12.1.4   Neutron  Scattering  Methods 

Neutrons  provide  an  enormously  powerful  alternative  to  X-rays  because  of  the  greater 
scattering  power  of  hydrogen  and  deuterium  relative  to  other  elements  in  biological 
molecules.  The  high  scattering  power  of  these  elements  makes  possible  contrast  varia- 
tion in  scattering  studies  and  the  localization  of  hydrogen  atoms  in  X-ray  structures. 
In  contrast,  hydrogen  atoms  are  not  observed  in  X-ray  studies  except  at  ultra-high 
resolution. The main restrictions on the routine usage of neutron scattering are limited
access to neutron sources and the length of time required for adequate data collection.
Nuclear  reactors  have  been  the  mainstay  of  neutron  sources  throughout  the  world,  but 
more  recently  spallation  sources  have  been  coming  online.  These  accelerator-driven 
sources  provide  beams  of  pulsed  neutrons  that  are  considerably  more  intense  than 
those  available  from  other  sources.  It  can  be  expected  that  these  sources  will  encour- 
age greater  use  of  neutrons  in  the  biophysical  studies  of  macromolecules. 
12.2    Mass  Spectrometry 

Although  one  of  the  "youngest"  analytical  techniques  in  the  experimental  arsenal  of 
biophysics,  mass  spectrometry  (MS)  has  already  established  itself  as  an  indispens- 
able tool,  providing  answers  to  challenging  problems  that  cannot  be  addressed  using 
other  approaches.  The  list  of  targets  suitable  for  MS  analysis  continues  to  expand, 
with  many  applications  that  seemed  ground-breaking  only  a  few  years  ago  now 
becoming  routine  and  commonplace.  As  the  entire  field  of  biophysics  continues  to 
advance,  MS  is  expanding  its  scope  of  inquiry  to  include  such  challenging  targets  as 
membrane  proteins,  large  macromolecular  assemblies,  and  many  others.  For  MS,  as 
for  other  areas  of  biophysics,  the  greatest  challenge  is  to  break  away  from  the  reduc- 
tionist paradigm  and  embrace  the  complexity  of  living  systems. 


12.2.1    Characterization  of  Conformation  and  Dynamics 
of  Membrane  Proteins  by  MS 

Although  membrane  proteins  constitute  about  one-third  of  the  entire  proteome,  the 
three-dimensional  structures  of  only  357  unique  membrane  proteins  were  available 
as  of  September,  2012.  The  architecture  and  dynamics  of  membrane  proteins  are 
defined  by  a  wide  range  of  intermolecular  forces,  including  interactions  with  the 
hydrophobic  interior  of  the  membrane,  its  polar  solvent  interface  region,  as  well  as 
internal  and  external  water  molecules.  As  a  result,  membrane  proteins  generally 
have  very  poor  solubility  characteristics,  making  any  experimental  study  of  the 
architecture,  dynamics,  and  interactions  of  these  macromolecules  extremely  diffi- 
cult. Traditionally,  solubilization  and  isolation  of  membrane  proteins  relied  on 
detergents, but many earlier attempts to characterize detergent-solubilized membrane
proteins by MS had very little success because of the suppressive effect of
detergents  [5].  While  various  techniques  that  remove  detergents  prior  to  MS  analy- 
sis remain  the  most  popular  strategy  for  dealing  with  this  problem,  such  a  dramatic 
change  in  the  environment  of  the  protein  inevitably  leads  to  the  loss  of  higher  order 
structure.  Fortunately,  small  amounts  of  detergents  can  be  tolerated  by  MS  at  least 
in  some  cases,  allowing  direct  ESI  MS  analyses  of  non-covalently  bound  membrane 
protein  assemblies  to  be  carried  out  after  reconstituting  them  in  a  minimum  amount 
of  detergent  [6].  A  similar  approach  was  used  recently  to  study  very  large  non- 
covalent  assemblies  of  transmembrane  proteins  [7]. 

Despite  initial  successes  in  using  detergents  for  direct  MS  characterization  of 
membrane  proteins,  one  must  be  aware  of  some  potential  pitfalls,  the  most  serious 
of  which  is  the  denaturing  action  of  many  (if  not  all)  detergents.  An  ideal  membrane 
mimetic  would  not  only  form  a  bilayer-based  structure,  but  also  reflect  the  physical 
properties  of  the  specific  biological  membrane.  Several  MS-based  experimental 
approaches  are  currently  under  investigation  as  potential  probes  of  the  structure  and 
behavior  of  membrane  proteins  with  bilayer-based  membrane  mimics.  These 
include  limited  proteolysis  to  identify  membrane-bound  protein  segments,  chemical 
probes  to  obtain  topological  information  on  various  protein  segments,  and  hydro- 
gen-deuterium exchange  to  provide  information  on  interfacial  positioning  and  sta- 
bility of  transmembrane  polypeptides  in  lipid  bilayers.  Another  recently  introduced 
bilayer-based  membrane-mimicking  system  is  a  nanodisc  where  the  bilayer  struc- 
tures are  maintained  by  membrane  scaffold  proteins  modeled  after  apolipoprotein 
A1. However, the best environment to study the behavior of membrane proteins is
indisputably  the  specific  biological  membrane  itself.  Although  characterization  of 
various  properties  of  membrane-bound  proteins  within  the  context  of  their  native 
environment  using  MS  was  a  technical  impossibility  until  very  recently,  several 
examples  of  such  studies  have  been  published  in  the  past  few  years  [8,  9]. 


12.2.2    Mass  Spectrometry  Above  1  MDa 

12.2.2.1  Characterization  of  Large  Macromolecular  Assemblies 

Large  protein  assemblies  play  crucial  roles  in  a  variety  of  cellular  functions.  For 
example,  each  cellular  protein  emerges  from  a  large  assembly  upon  its  birth  (ribo- 
some),  enters  another  large  assembly  at  the  end  of  its  life  (proteasome),  and  inter- 
acts with  a  number  of  other  macromolecular  assemblies  throughout  its  lifetime. 
While  the  ability  of  electrospray  ionization  mass  spectrometry  (ESI  MS)  to  detect 
and  characterize  relatively  modest  non-covalent  assemblies  of  proteins  and  other 
biopolymers  (e.g.,  protein/DNA  complexes)  was  recognized  over  20  years  ago  and 
has  been  used  actively  since,  large  macromolecular  assemblies  representing  com- 
plete self-contained  units  of  biological  machinery  (such  as  ribosomes  and  protea- 
somes)  remained  out  of  reach  of  MS  analysis  for  a  much  longer  period  of  time. 

The  situation  began  to  change  in  the  past  decade  as  a  result  of  pioneering  work 
of Robinson [10] and Heck [11], who demonstrated that careful control of ionization
conditions and use of mass analyzers with extended m/z range may allow very
large  non-covalent  complexes  to  be  preserved  in  the  gas  phase,  and  meaningful 
structural  information  to  be  extracted  for  protein  assemblies  whose  masses  exceed 
several  MDa.  Although  still  far  from  being  a  routine  method  of  analysis  of  large 
macromolecular  complexes,  the  so-called  native  mass  spectrometry  is  now  capable 
of  dealing  with  complex  objects  ranging  from  proteasomes  and  ribosomes  to  intact 
viral  capsids. 
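
To illustrate why an extended m/z range matters for assemblies above 1 MDa, the following minimal sketch computes the m/z of a multiply protonated ion and recovers the charge and neutral mass from two adjacent charge-state peaks. The 2.5 MDa mass and the charge states are illustrative assumptions, not values taken from the cited work, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of each charge carrier (proton)


def mz(mass_da: float, z: int) -> float:
    """m/z of an ion of neutral mass M carrying z protons: (M + z*m_p) / z."""
    return (mass_da + z * PROTON_MASS) / z


def deconvolve(mz_charge_z: float, mz_charge_z_plus_1: float) -> tuple:
    """Infer (z, neutral mass) from two adjacent charge-state peaks.

    mz_charge_z is the peak at charge z (higher m/z);
    mz_charge_z_plus_1 is its neighbour at charge z + 1 (lower m/z).
    """
    spacing = mz_charge_z - mz_charge_z_plus_1
    z = round((mz_charge_z_plus_1 - PROTON_MASS) / spacing)
    mass = z * (mz_charge_z - PROTON_MASS)
    return z, mass


# Hypothetical 2.5 MDa assembly observed at charge states 70+ and 71+
assembly_mass = 2.5e6
peak_70 = mz(assembly_mass, 70)
peak_71 = mz(assembly_mass, 71)
charge, mass = deconvolve(peak_70, peak_71)
print(f"peaks at m/z {peak_70:.1f} and {peak_71:.1f} -> z = {charge}, M = {mass:.0f} Da")
```

Under these assumptions the 70+ ion falls near m/z 35,700, far above the range in which typical peptide and small-protein ions are observed, which is why analyzers with an extended m/z range are needed for such complexes.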

12.2.2.2  MS  of  Highly  Heterogeneous  Proteins 

Despite  the  dramatic  expansion  of  the  mass  limit  of  macromolecules  for  which 
meaningful  information  can  be  provided  by  MS,  the  bar  remained  disappointingly 
low  until  recently  for  MS  analysis  of  several  classes  of  proteins.  These  include 
extensively glycosylated proteins and protein-polymer conjugates, which
frequently  exhibit  remarkable  degrees  of  structural  heterogeneity.  Heterogeneity 
poses  a  formidable  challenge  to  MS-based  studies  of  higher  order  structure,  dynam- 
ics, and  interactions  of  such  proteins,  frequently  making  the  mundane  task  of  mass 
measurement  an  extremely  challenging  undertaking.  Among  several  recent  devel- 
opments in  this  field,  a  particularly  promising  approach  combines  reduction  of  com- 
plexity of  the  protein  ion  ensemble  (by  mass  selecting  a  narrow  fraction  of  the  entire 
ionic  population)  and  gas  phase  chemistry  (charge  reduction  via  electron  capture  or 
electron  transfer)  [12]. 

Another  MS  technique  that  holds  great  promise  vis-a-vis  dealing  with  macromo- 
lecular  complexity  is  ion  mobility  (IM)  MS  [13].  While  the  majority  of  current 
applications  of  this  technique  exploit  its  ability  to  provide  information  on  the  physi- 
cal size  of  macromolecular  ions  in  the  gas  phase,  the  potential  utility  of  IM  MS  to 
provide  an  additional  separation  stage  prior  to  MS  detection,  thereby  reducing  com- 
plexity of  heterogeneous  systems,  is  frequently  overlooked.  Nevertheless,  the  abil- 
ity of  IM  MS  to  separate  various  isoforms  of  biopolymers  has  been  acknowledged 
and  has  already  been  used  to  facilitate  MS  characterization  of  covalent  structure  of 
large  glycoproteins  [14]  and  protein-polymer  conjugates  [15]. 

12.2.2.3    Mass  Spectrometry  In  Vivo 

A  very  important  aspect  of  macromolecular  interactions  in  vivo  is  their  extreme 
complexity  due  to  the  large  number  of  participating  players.  While  most  biophysi- 
cal studies  have  traditionally  used  the  so-called  reductionist  approach  by  focusing 
attention  only  on  the  minimal  number  of  players  deemed  absolutely  essential  for  a 
particular  process  or  interaction,  the  limitations  of  this  approach  are  now  becoming 
commonly  acknowledged.  Emergence  of  the  new  paradigm  that  embraces,  rather 
than  downplays,  the  complexity  of  biological  processes  has  been  catalyzed  by  the 
completion  of  genome  sequencing  for  several  organisms,  which  highlighted  the 
enormous  repertoire  of  biomolecules  making  up  living  cells. 

One approach to dealing with the complexity of real living systems that has enjoyed
great popularity in the past decade is functional proteomics [16-22]. Above and
beyond  proteomic  approaches  that  provide  a  global  picture  of  biomolecular  interac- 
tions in  living  systems,  a  number  of  groups  are  beginning  to  invest  significant  effort  in 
expanding  the  existing  experimental  strategies  to  study  biomolecules  in  their  native 
environment.  These  include  the  possibilities  for  investigation  of  protein  structure  and 
interactions  in  living  cells  provided  by  chemical  cross-linking  with  MS  detection  [23], 
or  chemical  labeling  and  footprinting  methods  [24].  Efficient  delivery  of  cross-linking 
and/or  labeling  reagents  to  the  cell  without  disrupting  its  normal  functioning  or  indeed 
killing  it  remains  a  formidable  challenge.  This  obstacle  places  a  significant  limitation 
on  the  number  of  reagents  that  can  be  used  in  such  in  vivo  measurements.  One  particu- 
larly attractive  approach  to  overcoming  this  problem  would  tap  into  the  arsenal  of  the 
emerging  field  of  synthetic  biology  by  reprogramming  the  genetic  code  of  the  cell, 
forcing  it  to  produce  and  incorporate  into  proteins  amino  acids  with  reactive  side 
chains  that  can  be  used  as  in  situ  chemical  probes  [25]. 
The  past  two  decades  witnessed  many  triumphs  of  MS  in  various  subfields  of 
biophysics  and  structural  biology,  and  it  is  certain  that  this  technique  will  remain  a 
valuable  contributor  in  these  fields,  catalyzing  their  progress  in  the  years  to  come 
and  bringing  about  new  exciting  discoveries.  Despite  reaching  a  respectable  age, 
biological  MS  remains  very  dynamic  and  constantly  adapts  to  the  ever  changing 
landscape  in  the  life  sciences,  always  remaining  at  the  forefront  and  ready  to  deal 
with  the  most  challenging  problems. 


12.3    The  Future  of  Biophysical  Analysis  in  Therapeutic 
Protein  Development 

The  development  of  therapeutic  proteins  is  an  endeavor  that  includes  extensive  use 
of  biophysical  techniques,  as  has  been  described  in  numerous  publications  (see,  for 
example  [26]).  As  is  the  case  for  all  proteins,  proteins  being  developed  and  used  for 
therapeutics  are  complex  macromolecules  that  require  appropriate  primary,  second- 
ary, and  tertiary  structure  to  maintain  their  function  and  stability,  as  discussed  in 
Chap.  2.  The  ultimate  goal  of  therapeutic  development  is  the  creation  of  a  molecule 
that  is  safe  and  efficacious,  and  that  will  maintain  its  structural  integrity  during 
manufacturing,  storage  (usually  two  years,  often  in  solution,  and  under  variable 
conditions),  and  administration.  Different  biophysical  tools  are  employed  during 
the  different  stages  of  development  of  this  important  class  of  drugs,  depending  on 
the  amount  of  material  and  time  available,  and  the  goal  of  the  analysis. 

The  protein  therapeutic  development  lifecycle  includes  several  steps,  beginning 
with  the  identification  of  a  biological  target.  After  the  target  has  been  chosen,  the 
molecule  with  the  greatest  chance  of  succeeding  as  a  drug  and  with  the  desired 
biological  activity  must  be  selected  from  multiple  candidates  with  different  primary 
sequences. Following the choice of candidate, process and formulation development
and characterization are the next steps, followed by selection of the delivery device and
route of administration. During all of these steps, the integrity of the
protein, in terms of its secondary, tertiary, and quaternary structure, needs to be
maintained.  The  last  step  in  this  process  is  clinical  trials  and  then,  if  successful, 
commercialization.  During  these  later  stages  of  development,  the  focus  is  on  prod- 
uct consistency  and  lot  release  assays,  exploration  of  different  delivery  devices  and 
therapeutic  indications,  comparability  assessments,  and  support  for  product  and 
process  failure  investigations. 

Biophysical  tools  are  used  at  all  of  these  stages  in  the  therapeutic  protein  lifecycle. 
Currently  characterization  is  done  by  removing  an  aliquot  of  the  sample  and  analyz- 
ing specific  properties  with  different  techniques,  and  then  using  heuristics  to  com- 
bine the  results.  This  can  be  time-consuming  and  involves  multiple  aliquots  and  a 
fair  amount  of  material.  The  desired  future  state  for  biophysical  assessment  of  pro- 
tein therapeutics  would  include  the  ability  to  do  multiple  analyses  on  the  same  sam- 
ple at  all  stages  of  development.  This  would  increase  the  reliability  of  the  results 
because  different  attributes  could  be  directly  compared,  and  also  enable  the  testing 
of more samples to better understand the variability of the methods. The vision of
the future for the application of biophysics during protein development also includes
being able to perform these tests on actual process samples and to obtain results in
real-time so that decisions can be made based on the identity, conformation of the
protein, and the state of aggregation. Automation and/or use of easy-to-operate
instrumentation for these techniques in a manufacturing environment is another
important goal for which to strive. Different phases of drug product development
have different specific needs as well that could result in the evolution of different
instrumentation and applications in the future. This issue is discussed below and
briefly summarized in Table 12.1.


Table 12.1 The goals and challenges of the "biophysics of the future of therapeutic protein
development"

Goal/challenge            Current                             Desirable

Analysis at high/low      Many biophysical techniques         At actual formulation concentration
concentration             require dilution to 0.5-1 mg/mL
                          range, others require
                          concentration to >10 mg/mL

High throughput           Many techniques are                 Automation, high throughput data
                          labor-intensive and low             collection and analysis
                          throughput

Noninvasive               Requires removing sample from       In situ analysis
                          device (vial or syringe)

Online biophysics for     Discrete sampling, off-line and     Online sampling and analysis during
process control           often time-consuming analysis       fermentation and purification, the
                                                              ability to make changes based on
                                                              result obtained on the fly

During  the  selection  of  the  unique  protein  that  will  become  the  product  candidate, 
in  addition  to  biological  activity,  the  stability  of  the  candidates  under  consideration  to 
the  conditions  used  for  manufacturing  and  storage  is  assessed.  Characteristics  to  be 
considered  include  stability  to  low  pH,  agitation,  mixing  of  the  air-liquid  interface, 
and  temperature.  The  protein  therapeutic  also  needs  to  withstand  storage  in  solution  at 
4-8 °C for two years, often at protein concentrations above 100 mg/mL [27]. Screening
for  this  type  of  stability  usually  involves  predictive  assays  that  rely  on  subjecting  the 
protein  to  harsher  conditions  than  it  would  encounter  normally,  in  order  to  predict 
what  may  happen  with  time  under  milder  conditions.  This  requires  an  understanding 
of potential pathways of degradation, in order to ensure that the response of the
protein to the conditions used is truly predictive of long-term stability during the actual
process.  After  stressing  the  material,  the  impact  of  the  conditions  on  the  integrity  of 
the  protein,  with  particular  emphasis  on  protein  aggregation  and  irreversible  unfold- 
ing of  the  native  three  dimensional  structures,  is  assessed.  Assays  with  minimal  mate- 
rial requirements  and  high  throughput  are  especially  valuable  at  these  early  stages. 
Qualitative  results  that  allow  comparison  of  the  relative  degree  of  change  so  that  can- 
didates can  be  categorized  as  passing  or  failing  are  an  acceptable  output. 
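
One common way to turn stressed-condition data into a prediction for milder conditions is an Arrhenius extrapolation. The sketch below is offered only as an illustration under explicit assumptions: the text above does not prescribe this model, the rate constants are hypothetical, first-order loss is assumed, and the function names are invented for the example. Many protein degradation pathways (aggregation in particular) deviate from Arrhenius behavior, which is exactly why the pathways of degradation must be understood before such a prediction is trusted.

```python
import math

R = 8.314  # J/(mol*K), gas constant


def arrhenius_fit(temps_c, rates):
    """Least-squares fit of ln(k) = ln(A) - Ea/(R*T); returns (ln_A, Ea in J/mol)."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, -slope * R


def predict_rate(ln_a, ea, temp_c):
    """Rate constant predicted by the fitted Arrhenius line at temp_c."""
    return math.exp(ln_a - ea / (R * (temp_c + 273.15)))


# Hypothetical first-order degradation rate constants (per month) under stress
stress_temps_c = [25.0, 37.0, 45.0]
stress_rates = [0.004, 0.018, 0.045]

ln_a, ea = arrhenius_fit(stress_temps_c, stress_rates)
k_storage = predict_rate(ln_a, ea, 5.0)          # extrapolate to refrigerated storage
loss_two_years = 1.0 - math.exp(-k_storage * 24)  # first-order loss over 24 months
print(f"Ea = {ea / 1000:.0f} kJ/mol, predicted loss after two years = {loss_two_years:.2%}")
```

For candidate selection, such an extrapolation would at most support the qualitative pass/fail comparison described above; the absolute numbers depend entirely on the assumed kinetic model.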

The  ability  to  assess  multiple  different  protein  characteristics  on  a  single  sample 
after  each  stress,  rather  than  having  to  remove  aliquots  followed  by  sample  manipu- 
lation in  order  to  be  compatible  with  the  different  analyses,  would  be  hugely 
valuable  at  this  stage.  Primary  attributes  that  should  be  assessed  are  conformational 
and  colloidal  stability,  the  propensity  of  the  protein  to  aggregate,  and  chemical  mod- 
ification of  the  amino  acid  residues  in  the  primary  structure.  The  ability  to  measure, 
either  directly  or  indirectly,  the  primary,  secondary,  tertiary,  and  quaternary  struc- 
ture of  a  protein,  and  the  size  of  any  aggregated  species  generated,  all  in  a  high 
throughput  format,  is  the  ultimate  goal.  The  more  candidates  that  can  be  assessed, 
the  greater  the  chance  of  identifying  one  that  has  the  desired  properties,  and  so  the 
availability  of  automated  methods  is  also  important.  In  the  future  one  can  envision 
a  robotic  system  that  subjects  samples  to  different  stresses  such  as  elevated  tempera- 
ture, extremes  of  pH  and  ionic  strength,  mechanical  shaking  or  stirring,  exposure  to 
light,  etc.,  and  then  runs  the  multi-well  plates  through  sequential  biophysical  analy- 
ses, ultimately  providing  a  relative  ranking  of  the  candidates  based  on  a  multivariate 
analysis of the matrix of data generated. It is worth noting that, in addition to
the traditional biophysical techniques (such as MS and different types of
spectroscopy), other methods, such as chromatography, can often be used as part of
this assessment. For example, ion exchange chromatography can detect changes in
chemical  modification,  hydrophobic  interaction  chromatography  can  be  used  to 
detect  changes  in  conformation,  and  size  exclusion  chromatography  can  follow  loss 
of  monomer,  or  formation  of  smaller  oligomers  such  as  dimers  and  tetramers  [28]. 
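
The relative ranking of candidates from a matrix of assay results can be sketched as follows. This is a deliberately simple stand-in for the multivariate analysis envisioned above: each attribute is standardized to a z-score and the scores are averaged with equal weight. The candidate names, readout values, and the choice of equal weighting are all hypothetical; a real workflow might instead use principal component analysis or weights chosen from knowledge of the degradation pathways.

```python
from statistics import mean, pstdev

# Hypothetical post-stress readouts; in every column a larger value means more damage.
# Columns: % monomer lost (SEC), % chemically modified (IEX), % conformationally altered (HIC)
readouts = {
    "candidate_A": [2.1, 1.0, 0.5],
    "candidate_B": [8.4, 3.2, 2.0],
    "candidate_C": [1.5, 4.1, 0.8],
    "candidate_D": [5.0, 2.0, 3.5],
}


def rank_candidates(data):
    """Return (mean z-score, name) pairs sorted from most to least stable."""
    names = list(data)
    columns = list(zip(*(data[name] for name in names)))
    scores = {name: 0.0 for name in names}
    for column in columns:
        mu, sigma = mean(column), pstdev(column)
        for name, value in zip(names, column):
            # A constant column carries no ranking information, so it contributes 0.
            scores[name] += 0.0 if sigma == 0 else (value - mu) / sigma
    return sorted((total / len(columns), name) for name, total in scores.items())


ranking = rank_candidates(readouts)
for score, name in ranking:
    print(f"{name}: {score:+.2f}")
```

Standardizing each column first keeps an attribute measured on a large numerical scale from dominating the ranking, which matters when readouts from different techniques are combined.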

An  important  aspect  of  this  early  stage  of  development  is  the  feedback  between 
protein  engineering,  modeling,  and  the  results  of  the  predictive  assays.  There  is  an 
iterative  process  as  the  correlation  between  the  predicted  behavior,  the  actual  behavior 
as  the  protein  moves  through  process  development,  and  the  structure  of  the  modeled 
protein  becomes  available.  Collecting  these  data  into  usable  databases  allows  constant 
improvement  in  the  sequence-based  predictive  algorithms,  such  that  more  and  more 
of  the  potential  "hot  spots"  for  modification  or  self-association  can  be  eliminated 
before  the  protein  is  ever  included  in  the  panel  of  potential  candidates  to  be  screened. 

The  focus  changes  to  developing  the  production  process  and  formulation  to  be 
used  for  the  commercial  product  once  the  specific  molecule  that  will  be  developed  as 
a  therapeutic  has  been  chosen.  At  this  stage,  material  availability  is  no  longer  rate 
limiting,  and  more  rigorous  techniques  that  compare  the  higher  order  structure  of  the 
actual  material  obtained  during  the  different  processing  steps  can  be  used  to  ensure 
that  the  final  product  was  not  irreversibly  damaged  by  the  conditions  being  used  for 
its  manufacture.  The  ability  to  get  real-time,  high  resolution  information  on  the  pri- 
mary, secondary,  and  tertiary  structure,  and  especially  of  the  aggregation  state,  of 
samples  as  they  are  generated  by  the  cells,  and  passed  through  the  purification  pro- 
cess, would  allow  decisions  about  sample  collection  to  be  made  based  on  the  quality 
of  the  material  as  it  was  being  processed.  This  requires  online  instruments  that  are 
robust  enough  to  withstand  the  conditions  of  a  protein  manufacturing  plant,  and  are 
also rapid enough to provide results in time to be used to make decisions about process conditions.
Online  light  scattering  analysis  to  assess  aggregation;  Raman  spectroscopy  to  assess 
the  secondary  and  tertiary  structure  of  the  protein;  mass  spectrometry  to  determine 
primary  structure  including  amino  acid  sequence,  carbohydrate  content,  and  chemical 
modification;  and  morphological  analysis  to  assess  types  of  aggregate  are  some  of  the 
potential  process  analytical  technologies  that  are  currently  being  explored. 
During  formulation  development,  the  stability  of  the  target  protein  is  assessed  in 
different  buffer  compositions,  pH,  storage  conditions,  and  delivery  devices.  These 
studies  typically  involve  the  generation  of  many  samples  that  must  be  analyzed  in 
order  to  arrive  at  the  optimal  formulation  conditions,  and  thus  many  of  the  princi- 
ples that  apply  during  candidate  selection  apply  here  as  well  and  some  of  the  same 
assays  can  be  used.  The  primary  difference  is  that  at  this  point  in  the  development 
lifecycle  the  amount  of  material  is  no  longer  rate  limiting  and  so  formats  other  than 
the 96-well (or higher density) plates can be considered. However, an instrument that uses
robotics  to  stress  and  test  multiple  samples  for  several  attributes  simultaneously  is 
still  the  goal.  Special  attention  should  be  paid  to  the  aggregation  state  and  the  integ- 
rity of  the  primary  sequence  of  the  protein.  Ideally  the  analyses  would  occur  under 
the  actual  solution  and  storage  conditions  that  would  be  used,  including  protein 
concentration.  The  majority  of  the  protein  therapeutics  under  development  will  be 
administered  at  high  concentration,  and  so  the  ability  to  determine  these  properties 
without  dilution  is  an  important  consideration. 

As  the  product  moves  into  production,  the  emphasis  switches  from  developing 
and  optimizing  conditions  to  maintaining  process/product  control;  biophysical  tech- 
niques to  follow  the  protein  higher  order  structure  are  important  elements  of  com- 
parability studies,  and  are  required  for  obtaining  licensure  of  the  drug.  In  this  case 
the  methods  must  be  shown  to  be  fit  for  the  purpose  and  the  sensitivity  of  the  assays 
to  detect  changes  in  the  product  must  be  determined.  Another  important  aspect  of 
preclinical  and  clinical  development  is  the  monitoring  of  stability  samples,  stored 
both  under  accelerated  and  recommended  conditions,  for  comparability.  Biophysical 
techniques  are  also  used  as  tools  to  help  ensure  that  changes  in  device,  concentra- 
tion, and  formulation  made  as  different  indications  or  patient  populations  are  added 
do  not  affect  the  conformation  of  the  biotherapeutic.  Techniques  that  can  give  repro- 
ducible and  accurate  results,  and  where  the  readout  is  understood,  are  most  com- 
monly used  at  this  stage  of  development,  rather  than  the  high  throughput  tests  that 
were  employed  in  the  beginning  of  the  product  development  lifecycle.  These  analy- 
ses must  be  sensitive  to  changes  in  the  protein  conformation  that  can  occur  if  the 
protein  is  exposed  to  slightly  different  process  or  storage  conditions,  as  demon- 
strated by  samples  exposed  to  conditions  outside  the  normal  parameters.  At  this 
stage  future  directions  lie  in  the  ability  to  carry  out  multiple  biophysical  tests  on  the 
same  sample  in  the  commercial  formulation,  removed  directly  from  the  commercial 
delivery  device.  This  capability  would  allow  for  testing  of  a  statistically  relevant 
number  of  samples,  and  direct  comparison  of  the  results.  One  difficulty  with  the 
current  tools  available  for  biophysical  characterization  of  proteins  is  that  most  of 
these  lack  the  sensitivity  necessary  to  detect  changes  of  less  than  5  %.  They  also 
provide  information  on  the  average  of  the  molecular  population.  Thus  even  when  a 
difference is detected, it is not possible to determine whether 6 % of the population
has lost all signal in that assay, or 100 % of the population has lost 6 % of the signal.
Evolution of single molecule methods to the point where they are applicable for
quick  and  reproducible  analyses  with  very  little  variability  would  be  a  huge  step 
forward  in  our  ability  to  interrogate  samples  and  truly  understand  if  they  are  com- 
parable or  not.  The  ability  to  apply  high  resolution  methods  such  as  NMR  and  MS 
to  gain  better  understanding  of  protein  higher  order  structure  down  to  a  single 
residue  is  being  explored  as  one  avenue  to  obtain  this  type  of  information.  The  abil- 
ity to  track  and  characterize  minor  species  in  the  structural  ensemble  that  is  present 
at  any  given  time  in  a  protein  solution  would  also  contribute  to  this.  Finally,  proteins 
are  not  static  species,  but  are  truly  dynamic  and  can  sample  multiple  folding  struc- 
tures as  part  of  the  natural  thermodynamic  equilibrium  of  the  states  possible  in 
solution.  Techniques  to  provide  quick  assessments  of  the  dynamics  of  any  given 
protein  solution  would  also  be  helpful  for  this  stage  of  development. 
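
The ambiguity of population-averaged measurements described in this paragraph can be made concrete with a short numerical sketch. It compares the two scenarios named in the text (6 % of molecules losing all signal versus every molecule losing 6 % of its signal) and shows that an ensemble average cannot tell them apart, whereas a per-molecule readout can. The normalized signal values, the 0.5 threshold, and the function names are assumptions made for illustration.

```python
def ensemble_signal(population):
    """Population-weighted mean signal, as reported by an ensemble-averaged assay.

    population is a list of (fraction of molecules, signal per molecule) pairs.
    """
    total = sum(fraction for fraction, _ in population)
    return sum(fraction * signal for fraction, signal in population) / total


def fraction_below(population, threshold):
    """Fraction of molecules whose individual signal falls below threshold."""
    total = sum(fraction for fraction, _ in population)
    return sum(fraction for fraction, signal in population if signal < threshold) / total


# Scenario 1: 6 % of molecules have lost all signal, 94 % are unchanged.
subpopulation_damaged = [(0.94, 1.0), (0.06, 0.0)]
# Scenario 2: every molecule has lost 6 % of its signal.
uniformly_shifted = [(1.00, 0.94)]

for label, population in (("subpopulation damaged", subpopulation_damaged),
                          ("uniformly shifted", uniformly_shifted)):
    print(label,
          f"ensemble mean = {ensemble_signal(population):.2f},",
          f"fraction of molecules below 0.5 = {fraction_below(population, 0.5):.2f}")
```

Both scenarios give the same ensemble mean; only the per-molecule distribution, the kind of information a single molecule method would supply, separates them.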

Finally, during commercial production, batches occasionally fail the various lot
release  assays;  biophysical  techniques  can  be  used  to  help  identify  the  root  cause 
and  contribute  to  the  safety  assessment  of  the  different  lots  of  protein  produced 
under  supposedly  equivalent  conditions.  In  this  case  very  often  a  single  sample  is 
being  tested,  with  a  single  visible  aggregate  being  studied,  and  so  the  methods  must 
have  the  sensitivity  to  detect  and  analyze  a  very  small  amount  of  protein  and  provide 
a  positive  identification  of  the  material  if  possible.  For  this  application  ideal  future 
biophysical  tools  would  include  analysis  by  mass  spectrometry  for  molecule  identi- 
fication and  determination  of  any  chemical  modification,  as  well  as  analysis  of  the 
conformation  of  the  protein,  and  the  aggregation  state.  This  analysis  should  occur 
in  situ  in  the  glass  vial,  syringe,  or  other  device  used  to  administer  the  drug  to  the 
patient,  and  all  the  analyses  must  be  performed  on  the  same  particle  or  other  species 
that  resulted  in  the  lot  release  failure.  While  throughput  is  important,  the  ability  to 
obtain  very  reliable  results  from  such  a  small  sample  set  is  far  more  important  than 
throughput  at  this  stage. 

As illustrated in Table 12.1 and in the discussion above, there are many gaps
between the current state of protein biophysical characterization during biotherapeutic
development and the desired future state. Although closing these gaps is challenging, much
progress has been made in recent years. The evolution of computational and material
sciences is resulting in miniaturization of instrumentation to the point where the "lab
on a microchip" will become feasible. The development of high throughput, automated
instruments that assess more than one attribute on these chips, coupled with
sophisticated statistical calculations and multivariate analysis of the information,
does not seem outside the realm of possibility in the relatively near future.


12.4  Conclusions 

This  chapter  has  provided  perspectives  on  future  directions  in  three  major  areas  of 
molecular  biophysics,  as  examples  of  what  the  future  holds.  While  many  advances 
will be specific to a particular field, there are several overarching themes that are
common  to  the  three  topics  discussed  here  as  well  as  many  other  areas  of  biology. 
Throughout  biophysics,  the  focus  is  moving  from  isolated  components  to  entire 
systems.  At  the  same  time  single  molecule  analyses  will  increasingly  enable  us  to 
visualize  and  characterize  minority  species  against  the  background  of  the  entire 
molecular  population,  including  transitory  states.  The  evolution  of  high  throughput 
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methods  will  result  in  an  increase  in  throughput  and  a  decrease  in  the  amount  of 
material  required.  Increasing  use  of  automation  will  make  biophysical  approaches 
accessible  to  a  wider  group  of  users,  and  application  to  a  larger  variety  of  systems. 
Massive  databases  will  allow  comparison  of  results  across  different  samples,  sys- 
tems, and  even  laboratories,  while  increasingly  powerful  computational  approaches 
will  enable  large  systems  to  be  modeled  at  high  resolution.  Finally,  the  emerging 
field  of  synthetic  biology  will  enable  biophysics  to  extend  beyond  natural  systems 
to  novel  synthetic  systems. 



Index 


A 

A1AO-ATP  synthase,  315
A-ATPase  synthase,  314,  315,  317-318
A1-ATPase  synthase,  317-318
Absorbance  spectra,  196 

amino  acids,  36 

chromophores,  39 

proteins,  39 
Actin 

atomic  structure,  343 

ATP  binding,  354 

conformational  changes,  349 

fiber,  11

fluorescently  labeled  actin  filament,  351 
G-actin  structures,  343
in  vitro  actin  gliding  velocity,  351 
and  myosin,  10 

non-muscle  myosin  motility,  352 
Actin  binding  site,  345,  353 
Actin-myosin  interaction,  351 
Active  unwinding.  See  DNA  unwinding 

mechanism 
Activity  silencing,  317 
Actomyosin 

force  production,  349 

mechanical  cycles,  346 

mechanochemical  cycle,  347 
ADC.  See  Analog-to-digital  conversion  (ADC) 
Adenosine  diphosphate  (ADP) 

actin  coupling,  354 

ADP.BeFx,  346

ADP.VO4,  346 

bacterial  plasma/inner  mitochondrial 

membrane,  316 
equilibrium  mass  action  ratio,  325 
and  inorganic  phosphate  (Pi),  327,  328, 

346,  347 


Adenosine-5'-diphosphate.aluminum-fluoride 

(ADP.AlF4),  346
Adenosine-5'-diphosphate.beryllium-fluoride

(ADP.BeFx),  346
Adenosine-5'-diphosphate.vanadate

(ADP.VO4),  346
Adenosine-5'-triphosphate  (ATP)
ATPase  cycle,  351
hydrolytic  cycle,  342 
Adenosine  triphosphate  (ATP) 
enzymes,  273 

hydrolysis-driven  proton  pump,  316 

process,  ion  gradient-driven,  325 

protein  translocation,  266 

proton  gradient,  329 
Adenylyl-imidodiphosphate  (AMPPNP),  346 
ADP.  See  Adenosine  diphosphate  (ADP) 
AFM.  See  Atomic  force  microscopy  (AFM) 
A0-ion  channel,  315,  318 
Alignment  tensor,  138 
Allison,  D.P.,  261 
Allosteric  activation,  342 
Allosteric  regulation,  342 
Allostery,  342,  354 
Alzheimer's  disease,  27,  101 
Amide  bands,  64,  66 

Amide  exchange.  See  Hydrogen-deuterium 
exchange 

AMPPNP.  See  Adenylyl-imidodiphosphate 

(AMPPNP) 
Amyloid,  27 
Amyloid  fibrils,  101 
Analog-to-digital  conversion 

(ADC),  154,  158 
Analytical  ultracentrifugation,  343 
Angular  momentum,  spin,  164-165 
Anomalous  scattering  methods,  97 


N.M.  Allewell  et  al.  (eds.),  Molecular  Biophysics  for  the  Life  Sciences, 
Biophysics  for  the  Life  Sciences  6,  DOI  10.1007/978-1-4614-8548-3, 
©  Springer  Science+Business  Media  New  York  2013 



Araki,  A.,  61 

Archaeal  ATPase,  315 

Aromatic  amino  acids,  35-36,  40 

Astbury,  W.,  99

Ashkin,  A.,  266 

Assembly,  342,  345 

Atomic  force  microscopy  (AFM) 

advantages,  260 

description,  260 

imaging  modes,  262 

principles,  260-261 

single  biomolecules,  262-263 

SMFS,  263-266 
ATP.  See  Adenosine  triphosphate  (ATP) 
ATPases 

actomyosin  cycle,  351 

allosteric  regulation,  342 

rotary  motor  (see  Rotary  motor  ATPase) 
ATP  binding  site,  345 
ATP  hydrolysis  by  myosin,  346-348 
ATP  synthase 

chloroplast,  315 

E.  coli,  320-321 

B 

Back  door  hypothesis,  349 
Back-exchange,  239,  240 
Bacterial  V-ATPase

modeling  studies,  330 

single  molecule  rotation  experiments,  329 
Bacteriophage  T7 

ATP/GTP  unwinding,  298 

enzymological  studies,  293 

replicative  helicase,  307 
Baenziger,  J.E.,  67 
Bagshaw,  C.R.,  346,  348 
Bagshaw-Trentham  model,  348 
Bandwidth,  115-117 
Barth,  A.,  66 
Belfort,  G.,  68 
Bernard,  C.,  3
β-sheet,  349

Betterton,  M.D.,  303,  305 
Biemann,  K.,  220 
Biological  macromolecules,  11-12 
Biological  motility,  342 
Biological  motor  system,  342 
Biomolecular  machines,  315 
Biomolecular  structure,  114,  216 
Biophysics 

computation  role,  10-11 

description,  2-3 


electron-electron  spin-spin  interactions, 
203-211 

hemoglobin  three-dimensional  structure,  6-7 
hyperfine  coupling  and  spin-density 

distributions,  200-203 
molecular  biophysics,  4 
NMR,  166 

proteins,  atomic  resolution,  6 

rotational  catalysis,  326-327 

spin-orbit  coupling  and  g-value  anisotropy,
198-200 
Biro,  N.A.,  342 
Boltzmann  distribution,  119 
Bonhoeffer,  K.F.,  236 
Boyer,  P.D.,  326 
Brownian  ratchet,  330-332,  352 
Burley,  R.W.,  236 
Bustamante,  C.,  267


C 

13C,  123-124,  126,  130 
Ca2+,  342,  353 

CAD.  See  Collision-induced  dissociation  (CID) 
Calmodulin  family,  343,  353 
CAN  experiment,  160 
Cardiac  muscle,  345,  353 
Carr-Purcell-Meiboom-Gill  experiment 

(CPMG),  151 
CBCA(CO)NH  experiment,  131,  159 
CC-COSY  experiment,  160 
CCD.  See  Charge-coupled  device  (CCD) 
CC-TOCSY  experiment,  160 
CD.  See  Circular  dichroism  (CD) 
Cell  biology,  2,  14,  59,  270,  281,  342,  368 
Cell  division,  352-353 
Cell  migration,  352-353 
Cellular  protrusions,  352-353 
Central  stalk 

lipid  vesicle,  323 

and  stator  stalk,  318 

synthesis/hydrolysis,  320 
Chaperone,  266 
Chaperonins,  280-281 
Charge-coupled  device  (CCD),  272,  277 
Charge  state  distribution 

determination,  protein  mass,  216 

protein  compactness,  solution,  234-236 
Chattopadhyay,  A.,  58 
Chemical  cross-linking 

DNA  higher  order  structure,  247 

MS  detection,  372 

proteins,  243-245 




Chemical  energy,  342 
Chemical  labeling 

chemical  and  nonselective  (oxidative),  246 

FPOP,  247 

selective  modification,  245 
Chemical  shift 

and  nuclear  shielding,  121-123 

paramagnetic  effects,  123-124 

perturbation  and  line  width  changes, 
147-148 
Chemiosmotic  hypothesis,  315 
Chittur,  K.K.,  65 
Chromophore 

aromatic  amino  acid,  39 

CD  spectra,  46 

peptide,  35 
CID.  See  Collision-induced 

dissociation  (CID) 
Circular  dichroism  (CD) 

conformational  changes,  proteins,  46-47 

DNA,  47-48 

protein  and  DNA,  50-51 

protein  binding,  49 

quantitative  comparison,  48 

secondary  structure,  43-44 

SRCD,  50 

tertiary  structure,  45 

theory,  42-43 

VCD,  50 
Coherence  (in  NMR),  124 
Coherence  transfer 

J-coupling,  124 

NMR,  124 
Coiled  coil,  321,  323,  344
Collision-activated  dissociation  (CAD). 
See  Collision-induced 
dissociation  (CID) 
Collision-induced  dissociation  (CID),  219 
Computational  model  fitting,  300-302 
CON  experiment,  160 
Conformation 

aromatic  side  chains,  49 

changes,  proteins,  46 

DNA,  47 

FRET,  58 

proteins  and  nucleic  acid,  36 

therapeutic  proteins,  65 
Conformational  change,  349 
Conformational  diseases,  249 
Conjugate  peak  refinement  (CPR),  350 
Connexin,  262-263 
Conserved  loop,  348 
Contact  shift,  123 
Contractile  apparatus,  350 


Contractile  properties,  342 

Contrast  transfer  function  (CTF),  279 

Converter,  345,  348 

Correlated  spectroscopy  (COSY) 

experiment,  126 
Correlation  time,  144,  150 
Coupling,  dipolar 

NOE  (see  Nuclear  Overhauser

effects  (NOE)) 
residual  (RDC)  (see  Residual  dipolar 

couplings  (RDC)) 
Coupling,  scalar,  124 

CPR.  See  Conjugate  peak  refinement  (CPR) 
Crick,  F.,  4,  5,  19,  99,  101
Cross-β,  101
Crossbridge,  343,  345 
Cross-correlated  relaxation  enhanced 

polarization  transfer  (CRINEPT) 
experiment,  144,  160 
Cross-linking.  See  Chemical  cross-linking 
Cross-polarization,  139,  141-142 
Cryo-electron  microscopy  (cryo-EM) 

actomyosin  rigor  complex,  348-349 

A/V-type  ATPase,  323 

biological  macromolecules,  275 

and  enzyme  decoration,  323 

rotary  ATPases,  315 
Cryo-EM.  See  Cryo-electron  microscopy 

(cryo-EM) 
Crystallography 

ATPase  sectors,  rotor  rings  and  stator 
stalks,  318-321 

near-atomic  resolution  structure,  315
Crystals 

distribution,  electrons,  92 

three-dimensional  diffraction  grating,  93 
Crystal  structure,  344,  345 
CTF.  See  Contrast  transfer  function  (CTF) 
CWEPR  spectra,  186 
Cytoskeletal  transporter,  353 
Cytoskeleton,  367-368 

D 

D'Antonio,  J.,  69

da  Vinci,  Leonardo,  3 

DEER.  See  Double  electron  electron 

resonance  (DEER) 
Delbrück,  M.,  4-7
Dempsey,  C.E.,  237 
Deoxyribonucleic  acid  (DNA) 

helical  structure,  19 

and  proteins,  21 

and  RNA,  18




Deuteration  (of  macromolecules  for  NMR),  156 
Deuterium-neutron  scattering,  107-109 
Dichroism,  circular,  8 
Differentiation,  353 
Diffraction,  277 

Diffusion  (NMR  measurement  of),  146 
Digitization,  NMR  signal,  131 
Dimerization,  352,  353 
Dipolar  coupling,  190,  197,  203 
Dipolar  coupling  and  nuclear  Overhauser 

effects,  133-137 
Dipolar  relaxation,  149 
Dipole,  nuclear  spin,  124 
Dissociation  constant  (Kd),  297 
Distribution 

charge  state  (see  Charge  state  distribution) 

isotopic,  219,  239 
Disulfide  bond 

aromatic  amino  acid  chromophores,  39 

circular  dichroic  spectroscopy,  8 

PTM,  221 

sequence  coverage,  240 
Disulfide  bridge.  See  Disulfide  bond 
DLS.  See  Dynamic  light  scattering  (DLS) 
DNA-based  motor  proteins,  352 
DNA  diffraction,  99,  104 
DNA  polymerase,  260,  308-309 
DNA  replication.  See  Replication 
DNA  unwinding 

active  vs.  passive  mechanism,  303-305 

computational  model,  300-302 

polymerase  (see  DNA  polymerase) 

primase  (see  Primase) 

processivity,  302 

rate  (Ku) 

average  base  pair,  302 
ensemble  methods,  293 
and  Kd,  302 
T7  helicase,  308 
DNA  unwinding  mechanism 

active,  303-305 

passive,  303-305 
Docking,  343 
Domain 

actin  monomer,  343 

sequence  information,  352 

tails,  myosins,  353 
Double  electron  electron  resonance 

(DEER),  190 
Double-quantum  filtered  (DQF)  COSY 

experiment,  126 
Drag  force,  354 
Drug  development,  373,  374 
Duty  ratio  (duty  cycle),  354 


Dynamic  information,  NMR 

chemical  shift  perturbation  and  line  width 
changes,  147-148 

crystallization,  145 

diffusion  measurements,  146 

EXSY  and  zz-exchange  spectroscopy,  149 

H/D  exchange,  147 

heteronuclear  relaxation,  149-152 

macromolecular  motions,  145 

solid  state  2H  line  shape  analysis,  152 
Dynamic  light  scattering  (DLS) 

cytochrome,  79-80 

hydrodynamic  radius,  78-79 

protein  concentrations,  80 
Dynamic  range  (NMR),  158 
Dynein,  11 

E 

ECD.  See  Electron  capture  dissociation  (ECD) 

Effector  binding  domain,  353 

Eisenberg,  D.,  101 

Elastase,  108,  109 

Elastic  energy  coupling,  330 

ELC.  See  Essential  light  chain  (ELC) 

Electrochemical  potential,  315-316 

Electron  capture  dissociation  (ECD),  220 

Electron  density  distribution,  93 

Electron  density  equation,  104,  105 

Electron-electron  spin-spin  interactions 

high  spin  metal  center  systems,  208-211

light-induced  radical  pairs,  204-206 

molecular  triplet  states,  206-207 

site-directed  spin  labeling,  203 
Electron  micrograph,  343 
Electron  microscopy  (EM) 

molecular  machines,  315 

projection  images,  3D  reconstructions, 
322-324 

single-molecule  methods,  258 

structure  and  stoichiometry  analysis,  314 
Electron-nuclear  double  resonance 

(ENDOR),  190 
Electron  paramagnetic  resonance  (EPR) 

biophysics  (see  Biophysics) 

bound  electrons,  177 

characteristic  lineshapes 

rapid  tumbling  regime,  192-194 
rigid  limit  regime,  195-198 

experimental  techniques  (see  Experimental 
methods) 

field  modulation  detection,  178,  179 

interaction  energy,  177 

magnetization,  179-180 




muscle  fibers,  345 
physical  principles,  Zeeman 
interaction,  177 

Electron  spin-echo  envelope  modulation 
(ESEEM),  189,  190 

Electrophoresis,  343 

Electrospray  ionization  (ESI) 
charge  state  distribution,  216 
convoluted  process,  216 
direct  measurements,  232-234 
non-covalent  interactions,  232-234 
positive  ion  mode,  216-217 
spectra,  intact  proteins,  216 

EM.  See  Electron  microscopy  (EM) 

Endocytosis,  337 

ENDOR.  See  Electron-nuclear  double 

resonance  (ENDOR) 
Energetics,  350 
Energy  coupling 
F-ATPase,  327 
proton  channel,  322 
and  rotational  catalysis,  330 
Energy  transduction,  349 
Ensemble  unwinding  assays 
assembly,  helicase,  295-296 
DNA  substrate,  294,  295 
fluorescence 

DNA  strand  separation,  294 
stopped-flow  assay,  295 
radiometric,  294 
steady  state  vs.  pre-steady
state  kinetics,  294 

Enthalpy 

and  entropy,  24 

free  energy,  26-27 

products  and  reactants,  24 

temperature,  26 
Entropy 

and  enthalpy,  24 

negative  change,  24 
Enzymatic  cycle,  348,  351 
Enzyme  efficiency,  317,  327 
Enzymes 

chemical  mechanisms,  264 

electron  transfer  reactions,  200 

metabolism,  11-12 

molecular  motors,  342

synchronization,  294 
EPR.  See  Electron  paramagnetic 

resonance  (EPR) 
Equilibrium  constant,  24-25 
ESEEM.  See  Electron  spin-echo  envelope 
modulation  (ESEEM) 


ESI.  See  Electrospray  ionization  (ESI) 
Essential  light  chain  (ELC),  343 
Eukaryotic  cell,  314 
Exchange,  chemical,  147 
Exchange  spectroscopy  (EXSY),  149 
Excited  state 

fluorophore,  51

and  ground,  35 

molecules,  34 
EX1  exchange  regime,  238,  242 
EX2  exchange  regime,  238 
Exocytosis,  260 
Experimental  methods 

field  modulation  lock-in 
detection,  186 

magnetic  field  and  microwave 
frequency,  181 

pulsed  EPR,  186-190 

relaxation,  183-185 

resonator  design,  182-183 

sensitivity  and  consequences,  181-182 

transient  EPR,  190-188 
External  force,  342,  352 
External  load,  351 
Extinction  coefficient,  39,  40 
Extrinsic  fluorescence,  56-57 

F 

Fabian,  H.,  67 

F-actin.  See  Filamentous  actin  (F-actin) 
Faraday  constant,  316 
Fast  photochemical  oxidation  of  proteins 
(FPOP),  247 

F-ATPase 

animal  mitochondria,  315 

atomic  resolution  X-ray  structures,  321 

ATP  synthesis,  316 

binding  change  mechanism,  326 

crystal  structure,  319 

function,  bacteria,  314 
F1-ATPase

catalytic  mechanism,  327 

crystallographic  structure,  322 

fluorescence  anisotropy  relaxation 
measurements,  327 

mitochondria-rich  animal  tissues,  318 
Fenn,  J.B.,  216 
Fenn,  W.O.,  350,  351 
Fermi,  G.,  6 
F1FO-ATP  synthase

detergent-solubilized  membranes,  323

rotary  ATPases,  316 




Fiber  diffraction 

experimental  arrangement,  99,  100 
fibrous  biological  materials,  99 
highly  ordered  gels,  102 
molecular  dynamics  and  energy 

minimization,  101 
molecular  transitions,  103 
natural  Syrian  hamster,  101,  102 
structure,  TMV,  102,  103 
FID.  See  Free-induction  decay  (FID) 
Filament  assembly,  345,  353 
Filamentous  actin  (F-actin),  343,  346 
Filamentous  viruses,  100,  102 
FIONA.  See  Fluorescence  imaging  with 

one  nanometer  accuracy  (FIONA) 
Florin,  E.-L.,  263 
Fluorescence 

microscopy,  59-60 
quenching,  56 
spectroscopy 

description,  51-52 
extrinsic,  56-57 
FRET,  58-59 
intrinsic,  53-54 
molecule,  52 

protein  conformation,  54-55 

quenching,  56 

red  edge  excitation,  58 

solid-state,  60-62 

time-resolved  fluorescence 
spectroscopy,  53 
Fluorescence  imaging  with  one  nanometer 

accuracy  (FIONA),  273,  352 
Fluorescently-labeled  nucleotide,  352 
Fluorescent  probe,  345 
Fluorophore,  270-271 
Folding.  See  Protein  folding 
Force 

manipulation  techniques,  342 

powerstroke,  348 

production,  349-350 
Force-clamp,  264-265,  268 
Force-feedback,  268 
Force-generating  step,  343,  350 
Force  generation 

kinetic  pathway,  348 

and  motility,  350-352 
Force  manipulation,  342,  350 
Force  production,  349,  350 
Force-velocity  relationship,  350 
Förster/Fluorescence  resonance  energy
transfer  (FRET) 

dipole-dipole  coupling,  52 


nanometer  distance,  310 

optical  spectroscopic  methods,  58-59 

ring-shaped  replicative  helicases,  309 

single-molecule,  273 

and  smFRET,  329 

subunit  rotation,  328 
Fourier  transform  (FT) 

cyclotron  frequencies,  227 

electron  density  distribution,  93 

image  processing,  279 

scattering  curve,  105-106 
Fourier  transform  infrared  (FTIR) 

amide,  64-65 

clinical  diagnostics,  65 

data  analysis,  68-69 

difference  spectroscopy,  66 

H/D  exchange,  66-67 

IR  spectroscopy,  64 

peptide  backbone,  63 

protein  folding  and  stability,  67-68 

protein  secondary  structure  analysis,  65 

proteins,  lipids  and  nucleic  acids,  64,  65 

tertiary  structure,  66 
Fourier  transform  ion  cyclotron  resonance  MS 
(FT  ICR  MS) 

broadband  excitation  and 
detection,  225,  226 

electrostatic  and  magnetic  fields,  225 

high-resolution  mass  analysis,  227 

ion  fragmentation  techniques,  227 

mass  analyzer,  225 

orthogonal  ion  fragmentation 
techniques,  227 

ultra-high  mass  resolution  and 
accuracy,  226 

unsynchronized  motion,  226 
Fourier  transform  MS  (FT  MS).  See  Fourier 
transform  ion  cyclotron  resonance 
MS  (FT  ICR  MS) 
FPOP.  See  Fast  photochemical  oxidation

of  proteins  (FPOP) 
F0-proton  channel,  326 
Fragment  ion.  See  Ion 
Franklin,  R.E.,  4,  5,  99 
Free-electron  lasers,  369 
Free  energy,  315-316
Free-induction  decay  (FID),  116 
French,  D.L.,  67 

FRET.  See  Förster/Fluorescence  resonance

energy  transfer  (FRET) 
FT  ICR  MS.  See  Fourier  transform  ion 

cyclotron  resonance  MS 

(FT  ICR  MS) 




G 

G-actin.  See  Globular  actin  (G-actin) 
Gas  constant,  25 
Geeves,  M.A.,  349 
Grit,  301 

Gibbs  free  energy,  24 
Gliding,  351 

Globular  actin  (G-actin),  343 
Glycogen,  18 
Glycolipids,  18,  20 
Glycoproteins,  18,  19 
Glycosylated  proteins,  371 
Gold  nanorods,  329-330 
Griebenow,  K.,  65 

GTP-dependent  signaling  system,  342 
Guanosine-5'-triphosphate  (GTP),  342
Guinier,  A.,  104,  105 
Guinier  approximation,  104 

H 

1H,  114-115,  122

Half  channel 

accessibility  experiments,  331 

proton  pathway,  323 

T.  thermophilus  ATPase,  331 

Hanson,  J.,  343 

Hare,  D.R.,  162 

Hartmann-Hahn  condition,  126,  131,  141 
Harvey,  W.,  3 
H+-ATPase,  317

HCCH-TOCSY  experiment,  126,  131, 

144,  156,  159 
H/D  exchange.  See  Hydrogen-deuterium 

(H/D)  exchange 
HDX.  See  Hydrogen-deuterium  exchange 

(HDX) 
Heat,  24,  26 
Heavy  chain,  343-345 
Heavy  chain  fragment,  345 
Heavy  meromyosin  (HMM),  343-346 
Heck,  A.J.R.,  371 
He,  R,  38 

Helical  biopolymers,  99 
Helicases,  13 
Hemi  channel,  331 
Hemoglobin,  23 
Hershey,  A.,  6,  7 

Heteronuclear  relaxation,  149-152 
Heteronuclear  single-quantum  coherence 
(HSQC)  experiment 

NOESY  experiment,  134 

polarization  transfer,  131 


Hexameric  helicases 

dsDNA  translocation,  292 

translocation,  ssDNA,  292 
Higher  order  structure 

DNA,  247 

and  dynamics,  RNA,  248 

High  throughput  automation,  377-378 

Hillenkamp,  F.,  218

Hirschfeld,  T.,  270 

HNCACB  experiment,  129,  142,  160 

HNCA  experiment,  129,  130 

HN(CO)CA  experiment,  129,  131,  160 

Hodgkin,  A.,  4 

Hodgkin,  D.C.,  4,  5 

Holmes,  K.C.,  349 

Hookean  spring,  266-267 

Horizontal  gene  transfer,  314 

Huxley,  A.F.,  4,  343

Huxley,  H.E.,  343 

Hvidt,  A.,  236 

Hydrogen  bonds,  21,  23 

Hydrogen-deuterium  exchange  (HDX) 
backbone  amide  hydrogen  atoms,  237 
conformational  dynamics,  239,  240 
dynamic  structure  analysis,  66-67 
evolution,  deuterium  content,  242 
exchange-incompetent  state  of  the 

protein,  238 
hydrogen  bonding  network,  236 
intrinsic  exchange,  237 
measurements,  238-239 
NMR,  147 

protein  higher  order  structure,  239,  240 

receptor  binding  interface,  241 

spatial  resolution,  239 

top-down  approach,  measurements, 
242-243 

types,  reactions,  236 
Hydrogen  exchange.  See  Hydrogen-deuterium 

exchange  (HDX) 
Hydrolysis 

ATP,  346 

products  (see  Hydrolysis  products) 

Hydrolysis  products,  342,  346,  347,  349 

Hydrolysis  step,  347 

Hydrophobic  bonds,  23,  27 

Hydrophobic  interaction 

chromatography,  368,  375 

Hyperfine  coupling 

biological  samples  experiments,  201 
Fermi  contact  term,  201 
interaction,  organic  cofactors,  200 
partially  resolved  splitting,  199,  202 




Hyperfine  coupling  (cont.) 

photosynthetic  reaction  centers,  201
solution  spectrum,  nitroxide  radical,  194 
spin-density  distributions,  200-203 
valence  electrons,  201 


I

Imaging
AFM,  262 

molecular  motors,  260 
Inner  membrane,  316,  321 
In-phase  anti-phase  (IPAP)  experiment,  161 
Instrumental  analysis,  215 
Instrumentation 

electronics  of  NMR,  154 

macromolecular  NMR  spectroscopy,  152 

solid-state  NMR  probes,  153-154 

solution-state  NMR  probes,  153
Intact  muscle,  350 
Integral  membrane  proteins,  368 
Intracellular  transport,  23 
Intrinsic  exchange  rate 

amide  hydrogen  atoms,  239 

types,  labile  hydrogen  atoms,  237 
Intrinsic  fluorescence,  53-54 
Intrinsic  Trp  fluorescence,  346 
In  vitro 

actin  gliding  velocity,  351 

and  in  vivo,  345 

motility  assay,  351 
In  vivo,  345,  351 
Ion 

charge  state  distribution,  234-236 

ESI  MS,  216-217 

fragment 

electron-based  techniques,  231 
sequence  coverage, 

macromolecular  ions,  227 

MALDI,  217-218 

molecular,  216-218 

multiply  charged,  216 
Ion  channel 

catalytic  sites,  326 

F-ATPase,  323 

proteolipid  subunits,  317 
Ion  exchange  chromatography,  375 
Ion  gradient,  316,  325,  326 
Ionization 

applications,  216 

matrix  and  analyte  molecules,  218 

sequence  information,  229 
Ionization  source,  229 
Ion  mobility  (IM)  MS,  372 
Ion  motive  force  (IMF),  316 


Ion  trap  MS 

FT  ICR  MS,  225,  226 
mass  analyzer,  225 

quadrupole  and  triple  quadrupole,  222-224 
IQ  motif,  353 
Isometric  force,  351 
Isotope 

abundance,  221,  228 

distribution,  218-219 
Isotope  exchange,  346 
Isotope  labeling 

description,  155 

25  kDa  molecular  mass,  156 

proteins,  156 

J 

J-coupling.  See  Coupling,  scalar 
J-coupling  and  coherence  transfer 

coherence  transfer,  127 

Hartmann-Hahn  matching,  126 

nuclear  spins,  124 

one  bond  couplings,  126 

TOCSY,  126,  128 

TROSY-HSQC,  124-125 
Jülicher,  F.,  303,  305


K 

Kalonia,  D.S.,  60 
Kamerzell,  T.J.,  67,  68
Karas,  M.,  218 
Kelly,  S.M.,  44 
Kendrew,  J.,  6 

Keratin  (wool)  diffraction,  99,  100 
Kim,  J.,  59 
Kinesin,  11 

motor,  268 

and  myosin,  272 
Kinetic  pathway  selection,  348 
Kinetic  resolution,  346-350 
Kinetics 

pre-steady  state,  294 

steady  state,  294 

and  thermodynamics,  24-25 

trace 

estimation,  Lm,  299 
helicase-DNA  complex,  297-299 
Kinetic  step  size,  300 
Kinetic  study,  342,  346 
Kinetic  transition,  350 
Kinetochores,  367 
Klewpatinod,  M.,  49 
Klibanov,  A.M.,  65 
Kratky  Plot,  104,  105 




L 

Lab  on  a  microchip,  377 

Lakowicz,  J.R.,  58 

Lambert-Beer  law,  38,  40 

Larmor  frequency,  114,  116 

Laser  beam,  266-267 

Laser  induced  liquid  bead  ion  desorption 

(LILBID),  324 
Latypov,  R.F.,  55 
Leeuwenhoek,  A.,  3,  4 
Length  clamp,  263-265 
Length-jump  experiments,  350 
Lever 

movement,  345 

priming,  349 

swing  (see  Lever  swing) 
Lever  arm  hypothesis,  351 
Lever  length,  351
Lever  movement,  345,  348,  349 
Lever  orientation,  345,  347 
Lever  priming,  348,  349 
Lever  swing,  346,  347,  349 
Levin,  M.K.,  302 
Light  absorbance 

biological  macromolecules,  38 

chromophores,  41 

concentration  determination,  40-41 

detection,  UV  region,  39-40,  42 

nucleic  acids,  39 

protein  spectrum,  38 

structural  determination,  41,  42 
Light  chain  of  meromyosin,  343-345,  353 
Light-induced  radical  pairs 

antiphase  doublet  split,  204 

energy  level  diagram  and  spectra,  204,  205 

out-of-phase  echo  modulation  experiment, 
205,  206 

polarization  pattern,  204 

population  distribution  and  dynamic 
properties,  204 

W-band  spectra,  204,  205 
Light  scattering 

applications,  80-81 

DLS,  76,  78-80 

electromagnetic  radiation,  36 

QELS,  76 

Rayleigh  ratio,  37

static  light  scattering,  77-78 

techniques,  81 
Light  spectroscopy 

amino  acids  electronic  transitions,  35-36 

description,  33 

electron,  34 

Jablonski  diagram,  34-35 


Planck  relation,  34 

transition  dipole  moment,  35 

tryptophan,  36 
LILBID.  See  Laser  induced  liquid  bead  ion 

desorption  (LILBID) 
Limited  proteolysis,  343 
Linderstrom-Lang,  K.,  236 
Lineshape  analysis,  152 
Lipid  bilayer,  316,  321,  329
Lipid  membrane,  321,  332 
Lipids,  18,  20,  23 
Liposome,  329 

Liquid  chromatography  (LC)-MS 
protein  sequence  information,  228 
proteolytic  fragments,  240 

Load,  350-351,  353

Load-bearing  myosin,  354 

Load-dependent  kinetics,  351 

Loop  1,  345,  349

Loop  2,  345 

Low-angle  scattering,  103 

Low-angle  X-ray  diffraction,  343 

Lower  50  kDa  subdomain  (L50),  345,  348 

Lucius,  A.L.,  300,  302 

Ludtke,  S.J.,  280 

Lumry-Eyring  framework,  28 

Luria,  S.,  6,  7 

Lymn,  R.W.,  346-348,  354 

Lymn-Taylor  model,  347,  348 

Lysozyme,  93,  96 

M 

Macromolecular  assembly 

biological,  22-23 

gas  phase,  232 

hydrogen  bonds,  23 
Macromolecular  crystallography 

anomalous  scattering  techniques,  97 

biochemical  research,  92 

crystallography  biology,  99 

electron  density  map,  94 

graphical  applications,  97 

isomorphous  replacement,  97 

myoglobin  and  hemoglobin,  98 

Ramachandran  plot,  hen  egg  white 
lysozyme,  96 

resolution,  95 

stereochemical  restraints,  95 

structural  determination,  94 

three-dimensional  diffraction  grating,  93 

X-ray  crystallographic  study,  92,  93 
Macromolecular  interactions,  275 
Magic-angle  spinning  (MAS),  139,  141 




MALDI.  See  Matrix-assisted  laser  desorption/ 

ionization  (MALDI) 
MALDI  matrix,  218 
Mantele,  W.,  66 

MAS.  See  Magic-angle  spinning  (MAS) 
Mass 

analyzers  (see  Mass  analyzers) 
average 

atomic  masses,  stable  isotopes,  218 
kinetic  energy,  ions,  232 
measurements,  218-219 
monoisotopic,  219 
most  abundant,  219,  221 
spectrometry  (see  Mass  spectrometry 
(MS)) 
Mass  analyzers 

FT  ICR  MS,  225-227 
quadrupole,  triple  quadrupole  and  ion  trap, 
222-224 

TOF  MS  and  hybrid  quadrupole,  224-225 
Mass  resolution,  219,  222,  224 
Mass  spectrometry  (MS) 

architecture  and  dynamics,  biological 

molecules,  249 
characterization,  macromolecular 

assemblies,  371 
covalent  structure,  polypeptides  and 

proteins,  227-231 
heterogeneous  proteins,  371-372 
higher  order  structure,  biopolymers, 

247-248 

hydrogen/deuterium  exchange  MS, 
236-243 

in  vivo,  372-373 

membrane  proteins,  370-371 

structural  information,  324-325 

tandem,  219-221 
Mass-to-charge  ratio  (m/z),  216,  218 
Matrix-assisted  laser  desorption/ionization 
(MALDI) 

biopolymer  ions,  218 

UV  light-absorbing  small  organic 
molecules,  218 
Mean  residue  ellipticity,  43 
Mechanical  energy,  342 
Mechanical  manipulation,  342,  350 
Mechanical  work,  343 
Mechanobiochemistry,  342 
Mechanochemical  cycle,  347,  349,  354 
Mechanoenzyme,  350 
Membrane  complexes,  367 
Membrane  potential,  316 
Membrane  proteins 

classical  biochemical  methods,  324 

structure,  258 


Membrane  trafficking,  353 
Metabolites,  18 
Metalloproteins 

Heliobacterium  modesticaldum,  210 

Kramer's  doublet,  208-209 

powder  spectrum  features,  209 

X-band  EPR  spectrum, 

horse  myoglobin,  211 

Zeeman  splitting,  208 

zero-field  splitting,  208 
Methot,  N.,  67 
Microscopy,  351-353 
Microtubules,  367 
Middaugh,  C.R.,  67,  68 
Milligan,  R.A.,  354 
Mitchell,  P.,  315 
Mitochondria 

biochemical  experiments,  322 

macrolide  oligomycin,  314-315 

multi-site  catalysis,  315 
Models,  use  of,  12 
Molecular  biophysics 

atomic-level  structural  methods,  8-9 

carbohydrates,  18,  20 

cell,  18 

circular  dichroic  spectroscopy,  8 

description,  7 

DNA  and  RNA,  18,  19 

fluorescence  spectroscopy,  8 

mass  spectrometry,  9 

proteins,  18,  19 

single  molecule  methods,  9-10 

and  structural  biology,  215 

three  dimensional  structure,  DNA,  19,  21 
Molecular  dynamics,  350 
Molecular  envelope 

scattering  profile,  105 

scattering  signal,  104 
Molecular  forces,  351 
Molecular  genetics,  342,  348 
Molecular  machines 

bovine  mitochondrial  F1  ATPase,  12,  13 

DnaB  hexameric  helicase,  12-13 
Molecular  modeling,  368 
Molecular  motor,  272,  315,  342 
Molecular  property,  352-354 
Molecular  replacement,  97,  98 
Molecular  structure 

neutron  crystallography,  108 

nucleic  acids,  5 

and  supramolecular,  343-346 
Molecular  triplet  states,  197,  206-207 
Molecular  tweezers,  10 
Molten  globule,  27 
Momentum  transfer  vector,  104 




Motility,  342,  350-352 

Motor  activity,  352 

Motor  domain  (MD),  345,  347 

Motor  enzyme.  See  Rotary  motor  ATPase 

Motor  proteins 

double-stranded  (ds)  unwinding,  291 
step  size  and  stepping  rate,  300 

Motor-track  systems,  354 

Movement  of  myosin  lever,  348 

MS.  See  Mass  spectrometry  (MS) 

MS/MS.  See  Mass  spectrometry,  tandem 

Muller,  D.J.,  263 

Multiple  isomorphous  replacement,  97 
Muscle 

fibers,  350-351 

seminal  discoveries,  342 
Muscle  contraction,  343,  350,  353 
Muscle  fiber,  351 
Muscle  myosin  2,  345,  353 
Mutagenesis,  348 
Myofibril,  343 
Myosin 

actin  fiber  drives  muscle  contraction,  11 
active  site,  348,  349 
holoenzyme,  343-344 
isoform,  349,  351-353 
molecular  motors,  266 
molecular  tweezers,  10 
superfamily,  353 
Myosin  1,  352 

Myosin  2,  344-345,  349,  353 
Myosin  5,  349,  353,  354 
Myosin  6,  353 
Myosin  10,  353 


N 

15N,  122,  123-124,  130 
Nano-ESI.  See  Nanospray  ionization 
Nanomachine,  258 
Nanometer,  260,  261 
Nanonewton,  260 
Nanoscale,  257 
Nanospray  ionization,  215 
Narhi,  L.O.,  67 

Native  mass  spectrometry,  324-325 

Naumann,  D.,  67 

NCACX  experiment,  142 

NCA  experiment,  142 

NCOCX  experiment,  142 

NCO  experiment,  142 

Neck  of  myosin,  345,  351,  353,  354 

Negative  stain,  323,  325 

Neutron  crystallography,  108-109 


Neutron  diffraction,  107,  108 
Neutron  scattering 

and  diffraction  methods,  107,  108 
and  X-ray 

automated  structural  determination,  367 
data  collection,  366 
diffraction  methods,  367 
macromolecular  crystallography, 

368-369 
methods,  369 
small  angle,  369 
structural  biology,  367-368 
Niedergerke,  R.,  343 

NMR.  See  Nuclear  magnetic  resonance  (NMR) 
NOESY  experiment 

acireductone  dioxygenase,  134 
HSQC,  135 
Non-muscle  cell,  345 
Non-muscle  myosin  2,  353,  354 
Normalized  spatial  discrepancy  (NSD),  107 
Normal  mode  analysis  (NMA),  350 
NSD.  See  Normalized  spatial  discrepancy 

(NSD) 
NTP  hydrolysis 

biochemical  and  structural,  292 
double-stranded  (ds)  unwinding,  291 
Nuclear  magnetic  resonance  (NMR) 

data  processing  and  assignment  software, 

162-163 
description,  114 
excitation,  transitions 
carrier  frequency,  115 
coherent  ensemble,  116,  118 
Larmor  frequency,  116 
pulse  length  and  excitation  bandwidth, 

116,  117 
pulse  phase,  116 
radiofrequency,  115 
isotope  labeling,  155-157 
nuclear  spin,  114 
nucleic  acids,  160-161 
residual  dipolar  coupling  and  diffusion 

measurements,  161-162 
resonance  frequencies,  114-115 
sample  requirements,  solution,  154-155 
solution  state,  160 
solvent  suppression,  158-159 
spin-lattice  (T1)  relaxation,  118-119 
structural  and  dynamic  analysis,  163-165 
structural  information,  9-10 
structural  information 

chemical  shift  and  nuclear  shielding, 

121-123 
D-couplings,  141-142 


dipolar  coupling  and  nuclear 

Overhauser  effects,  133-137 
gyromagnetic  ratios,  121 
HNCA  pulse  sequence,  130 
isotopes,  121 

macromolecules  and  membrane-bound 

proteins,  143-145 
paramagnetic  effects,  chemical  shift, 

123-124 
polarization  transfer,  131 
resonance  assignments,  142 
solid-state,  139 
valine  spin  system,  126-129 
type-selective,  residue-selective  and 
segmental  labeling,  157 
Nuclear  Overhauser  effects  (NOE) 
signal  intensity,  134 
transfer  of  polarization,  203 
Nuclear  pore  complexes,  367 
Nucleic  acid-based  molecular  motors,  342 
Nucleotide  affinity,  349 
Nucleotide  analog,  345-346 
Nucleotide  binding,  317,  319 
Nucleotide  binding  site,  345,  348,  353 
Nucleotide  exchange  factor,  342 
Nylon  beads,  327,  330 

O 

Oberhauser,  A.F.,  265 
18O  exchange,  326 
Ohgushi,  M.,  27 
Okten,  Z.,  274 
Oligomycin,  315 
Optical  spectroscopy 

CD,  42-51 

fluorescence,  51-63 

FTIR  (see  Fourier  transform  infrared  (FTIR)) 
light 

absorbance,  38-42 
scattering,  36-37 
spectroscopy,  33-36 
Raman  spectroscopy,  69-76 
Optical  trap,  351 
Optical  tweezers 

biological  applications,  268-269 
DNA  handle,  268 
molecular  motors,  266 
operation  principles,  266-267 
single-molecule,  268 
Orbitrap  MS,  225 


Organic  radicals 

Q-band  spectrometers,  181 
slow  relaxing  systems,  184 
spin-orbit  coupling,  199,  200 

P 

31P,  121,  132-133,  153,  160 
Pace,  C.N.,  40 

Pair  density  distribution  function  (PDDF), 

104,  105 
Particle  tracking,  350 
Partner  binding  domain,  353 
Passive  unwinding.  See  DNA  unwinding 

mechanism 
Pauling,  L.,  6,  99 

PDDF.  See  Pair  density  distribution  function 
(PDDF) 

PELDOR.  See  Pulsed  electron  double 

resonance  (PELDOR) 
Penefsky,  H.S.,  326 
Pepsin,  239 
Peptide 

Biemann's  nomenclature,  219,  220 
covalent  structure 

ESI  mass  spectrum,  229,  230 
fragmentation,  MALDI  MS,  229,  230 
glycopeptides,  231 
MS/MS  methods,  231 
oligonucleotide  sequencing,  229,  231 
polypeptide  sequencing,  227-229 
PTM,  229 
ionization  technique,  216 
isotopic  distributions,  219 
Peripheral  stalk,  317 
Periplasm,  331,  332 
Perutz,  M.F.,  4-6 

PFGs.  See  Pulsed  field  gradients  (PFGs) 
Phase  problem 

molecular  replacement,  97 

neutron  diffraction,  108 

X-ray  crystallography,  94 
pH  gradient,  329,  332 
Phosphate  (Pi),  346,  349 
Phosphorylation,  353 
Photodetector,  261,  267 
Physiological  function,  352-354 
Piconewton,  260,  269 
Pi  release  from  myosin,  349,  354 
PISEMA  experiment,  158 
Pixel  array  detectors,  368 
Plasma  membrane,  314,  316 




P-loop,  348,  349 
Polarization,  spin 

chlorophyll  triplet  state  spectra,  207 

EPR  spectra,  192 

transfer,  141 
Polyprotein,  264 
Polysaccharides,  18,  20 
Posttranslational  modification  (PTM),  229 
Powerstroke 

force-generating  step,  343 

up-to-down  movement,  348 
Prestrelski,  S.J.,  65 
Primase,  307-308 
Priming,  346,  348-350 
Probe,  NMR 

cryogenically-cooled,  153 

magic  angle  spinning  (for  solid-state 
NMR),  139,  141 
Processivity 

ATP  binding,  347 

myosin  5  molecules,  354 

nucleotide-induced  actomyosin,  349 
Protein 

NMR  spectroscopy,  321-322 
thylakoid  lumen,  316 
Protein  aggregation 
DLS,  78 

MS  analysis,  249 

stages,  234 
Protein  conformation,  46,  49,  50,  54 
Protein  Data  Bank  (PDB) 

molecular  replacement,  97 

software/applications,  92 

structural  analysis,  92 

X-ray  structure,  96 
Protein  folding 

energetics,  26-27 

molecular  mechanisms,  264 

optical  tweezers,  269 

time-resolved  fluorescence  spectroscopy,  53 
Protein  pharmaceutics,  27-28 
Proteins 

chemical  cross-linking 

heterobifunctional  cross-linkers,  243 
monofunctional  (or  zero-length) 

cross-linkers,  243 
MS  HDX  biochemical  technique,  243 
multi-protein  complex,  244 
tools,  bioinformatics,  245 
and  covalent  structure,  polypeptides, 

227-231 
HDX,  236-238 

ionic  charge  state  distribution,  234-236 


Protein  stability,  58 
Protein  therapeutics 

biophysical  techniques,  373 

commercial  production,  377 

formulation  development,  376 

instrumentation  and  applications,  374 

lifecycle,  stages,  373 

online  light  scattering  analysis,  375 

primary  attributes,  375 

production  process  and  formulation,  375 

protein  characteristics,  374 

Raman  spectroscopy,  375 

statistical  calculations  and  multivariate 
analysis,  377 
Proteobacteria,  314 
Proteolipid  ring 

atomic  resolution  detail,  318 

proton  channel,  322 

SDS-resistant,  320 
Proteomics,  372 
Proto-eukaryote,  314 
Proton  channel,  322,  326 
Proton  concentration  gradient,  342 
Proton  gradient,  316,  329 
Proton  motive  force,  316 
Proton  pump 

eukaryotic  V-ATPase,  317 

sodium  ions,  316 
Proximity  map(s),  243 
Pseudocontact  shifts,  124 
PTM.  See  Posttranslational  modification 
(PTM) 

Pulsed  electron  double  resonance 

(PELDOR),  190 
Pulsed  EPR 

electronic  spin-spin  coupling,  190 

ENDOR,  190 

ESEEM,  189,  190 

magnetic  resonance  experiments,  186 

relaxation,  189 

spin-echo  methods,  188 

types,  188 
Pulsed  field  gradients  (PFGs),  131-132 
Pulse,  radio  frequency  (RF) 

rotor-synchronized  (in  SSNMR),  141 

selective,  141 
Pyrene  fluorescence,  348,  349 

Q 

QELS.  See  Quasi  elastic  light  scattering 
(QELS) 

Quadrupole  ion  trap  MS.  See  Ion  trap  MS 




Quadrupole  MS 

ion  trapping  devices,  223 

mass  filter,  222,  223 

mass  spectrometers,  222 

MS/MS  experiments,  222 

precursor  ion  scans,  222 
Quadrupole,  nuclear  spin,  115,  152 
Quasi  elastic  light  scattering  (QELS),  76 
Quenched  flow 

Ensemble  DNA  unwinding  assays,  295 

guanosines,  296 

polyacrylamide  gel  electrophoresis,  296 
ssDNA  trap,  299 
Quenched-flow,  346 


R 

Radio  frequency-driven  recoupling  (RFDR) 

experiment,  142 
Radon,  J.,  278 
Ramachander,  R.,  60 
Ramachandran  plot,  96,  101 
Raman  deconvolution,  74 
Raman  microscopy,  74-75 
Raman  spectroscopy 
Cys,  71 

deconvolution,  74 
DNA,  72 
His,  71 

intensity  and  frequency,  69 

microscopy,  74-75 

peptide  bond,  70 

protein  conformation,  72 

protein  dynamics,  73-74 

technologies,  75-76 

Trp,  71 

Tyr,  70 
Rate  constant,  25,  26,  274 
Rate-limiting  step,  346,  348 
Rath,  P.,  67 

RAVE.  See  Regulator  of  H+-ATPases 
of  vacuolar  and  endosomal 
membranes  (RAVE) 

Rayleigh  criterion,  272,  276 

Rayment,  I.,  345 

RDC.  See  Residual  dipolar  couplings  (RDC) 

Reactive  sulfhydryl  groups  (SH1  and  SH2),  345 

Recombinant  protein,  348 

Recovery,  347 

REDOR  experiment,  142 

Regulation,  342,  345,  353 

Regulator  of  H+-ATPases  of  vacuolar 

and  endosomal  membranes 

(RAVE),  317 


Regulatory  light  chain  (RLC),  343 
Relaxation,  nuclear  spin,  119 
Relay  loop,  348 
Replication 

unwinding  assay  (see  Unwinding  assay) 
Residual  dipolar  couplings  (RDC) 

characterizations,  138 

D-coupling,  137 

macromolecules,  138 

orienting  media,  138 

solid-state  NMR,  137 
Resolution 

definition,  95 

mass  (see  Mass  resolution) 
Reversibility,  346,  347 
Reversible  disassembly 

(or  dissociation),  317 
Reversible  reaction,  27 

R-factors,  95 
Rfree,  95,  96 

Ribonucleic  acid  (RNA) 
building  blocks,  19 
and  DNA,  18 

three-dimensional  structures,  21 
Ribosome,  12 

catalytic  activity,  343 

catalytic  residue,  348 
Right  handed  coiled  coil,  321,  323 
Rigor-like  state,  346 
Rigor  state,  346 
Ring  current,  122,  165 
R       95  96 
Robinson,  C.V.,  371 
Robotic  systems,  375 
Rod,  343-344 

Rotary  (molecular)  motor,  315 
Rotary  motor  ATPase 

cryo  electron  microscopy  maps,  332 

family,  314-315 

function,  315-317 

mechanism,  catalysis,  325-332 

structure,  317-325 
Rotating  frame  NOE  spectroscopy  (ROESY) 

experiment,  136 
Rotational  catalysis 

biophysics,  326-327 

and  elastic  energy  coupling,  330 
Rotor 

atomic  resolution  X-ray  structures,  321 
timescale  of  catalysis,  326 
X-ray  crystallography,  318-321 

Rubredoxin,  108,  109 

Rulers,  molecular,  8,  10 

Rwork,  95,  96 




S 

Sako,  Y.,  61 

Sarcomeric  myosin  2,  345 
Sawtooth  pattern,  length-clamp  mode,  264,  265 
SAXS.  See  Small  angle  X-ray  scattering 
(SAXS) 

Scattering 

cross  section,  109 

lengths,  107,  108 
Schrodinger,  E.,  4,  7 

SEC.  See  Size  exclusion  chromatography 

(SEC) 
Sensory  function,  353 
Sequence,  amino  acid,  229,  231 
Sequence  analysis,  344 
Serine  proteases-mechanism,  108,  109 
Sethuraman,  A.,  68 
Sharma,  S.,  68 
Sharma,  V.K.,  60 

Shielding,  nuclear  and  chemical  shift, 

121-123 
Shih,  W.M.,  58 
Shrodinger,  E.,  4 
Single  molecule  (SM) 
AFM,  329 

A/V-ATPase  sector,  329 

experimental  setup,  327,  328 

F1Fo-ATP  synthase,  327 

FRET,  329-330 
Single  molecule  fluorescence  methods 

biological  applications,  273-274 

description,  270 

emitted  fluorescence,  270 

experimental  setups,  270-271 

FRET,  272 

principles,  270 

"super-resolution,"  273 
Single  molecule  fluorescence  resonance 
energy  transfer  (FRET) 
spectroscopy 

Escherichia  coli  Rep,  273 

Tetrahymena  ribozyme,  274 
Single-molecule  force  spectroscopy  (SMFS) 

apparatus  and  control  electronics,  264-265 

avidin,  266 

description,  263-264 
Single  molecule  Forster  resonance  energy 

transfer  (smFRET),  329 
Single  molecule  methods 

AFM  (see  Atomic  force  microscopy 
(AFM)) 

cryo-EM,  275-281 

description,  258 

features,  258,  259 

forces  range,  258,  260 


molecular  tweezers,  10 
optical  tweezers,  266-269 
single-molecule  fluorescence  methods, 
270-274 

Single-molecule  motility,  351-354 
Single  molecule  observation  of  F-ATPase, 

327-329 
Single  molecule  unwinding  assays 

FRET  (see  Fluorescence  resonance  energy 

transfer  (FRET)) 
magnetic  tweezers 

piconewton  forces,  309 
single  molecule  methods,  309 
T7  helicase,  308 
optical  trap,  309 
Single-particle  cryo-electron  microscopy 
biomolecules,  277-278 
electron  microscope,  275-276 
image  processing,  279-280 
Rayleigh  criterion,  277 
techniques,  280-281 
X-ray  crystallography,  275 
Single  strand  binding  (SSB)  protein 
E.  coli,  305 
interaction  energy,  307 
replisome,  305,  306 
Single  strand  DNA  (ssDNA) 
and  dsDNA,  292 
gel-based  assay,  298 
strand  exclusion  model,  292,  293 
translocation 

description,  303 
dTTP  concentrations,  305,  307 
helicase's  speed,  309 
NTP  binding  and  hydrolysis,  303 
Size  exclusion  chromatography  (SEC), 

60,  62,  375 
Skeletal  muscle,  345,  346,  354 
Sliding  filament  model,  343 
Sliding  velocity,  351 
SM.  See  Single  molecule  (SM) 
Small  angle  neutron  scattering,  109-110 
Small-angle  X-ray  scattering  (SAXS) 
experimental  requirements,  103 
Guinier  approximation,  104 
Kratky  plot,  104,  105 
low-resolution  three-dimensional 

models,  106 
neutron  low-angle  scattering,  103 
neutron  small-angle  scattering,  109 
NSD,  107 
PDDF,  104 

quality,  structural  models,  107 

scattered  radiation,  92 

structural  determination,  RNA,  106 




smFRET.  See  Single  molecule  Forster 
resonance  energy  transfer 
(smFRET) 
Smooth  muscle,  345,  353,  354 
Solid-state  fluorescence 

intrinsic  fluorescence,  60,  61 

proteins,  60 

SEC,  60,  62 
Solid-state  NMR  (SSNMR),  139 
Sosa,  L.D.V.,  68 
Spallation  neutron  sources,  369 
Spectral  density,  149,  150 
Spectroscopic  probes,  345,  348 
Spectroscopy 

absorption,  8 

fluorescent,  8 

mechanochemical  cycle,  346 
Spin-lattice  relaxation,  119 
Spin  locking,  126,  152 
Spin-orbit  coupling 

metal  centers,  iron  proteins,  200,  202 
orbital  angular  momentum,  198 
organic  cofactor  radicals 
photosystem  I  (PS  I),  199 
Q-band  EPR  spectrum,  200,  201 
Spin-spin  relaxation 

D-modulated  interaction,  133 
heteronuclear  relaxation,  149-152 
large  molecules,  150 
Spliceosomes,  367 
Spontaneous  reaction,  24,  26 
SRCD.  See  Synchrotron  radiation  circular 

dichroism  (SRCD) 
SSB  protein.  See  Single  strand  binding  (SSB) 
protein 

SSNMR.  See  Solid-state  NMR  (SSNMR) 
Stacking  interactions,  23 
Static  light  scattering 

applications,  80-81 

and  DLS,  76 
Stator  stalk 

F-ATPase,  321 

statistical  image  analysis,  323 
V-ATPase,  325 

X-ray  crystallography,  318-321 
Stepping  model  of  DNA  unwinding.  See  DNA 

unwinding 
Stereochemistry  of  models,  96 
Stigler,  J.,  269 
Stopped  flow 

KinTek,  295 

quenched-flow  instruments,  296 
Strand  exclusion  model  of  unwinding 
hexameric  helicases,  292,  293 
translocation,  292 


Structural  biology,  248,  249 
Structural  genomics,  368 
Structural  state,  346-350 
Structure 

factor  amplitude,  93-94 

molecular  and  supramolecular,  myosin, 

343-346 
myosin  head,  348 
refinement,  95,  97 
secondary 

nucleic  acid,  74 
protein,  65 
tertiary 

de  novo  characterization,  144 
protein  folding  and  function,  60 
Structure-function  relationship,  342 
Subfragment  1  (S1),  343 
Subfragment  2  (S2),  343 
Sulfhydryl  groups,  270 
Supramolecular  assembly,  342 
Sweeney,  H.L.,  354 
Swinging  crossbridge  theory,  345 
Swinging  lever  hypothesis,  351 
Switch-1,  348,  349 
Switch-2,  348,  349 
Synchrotron,  368,  369 
Synchrotron  radiation 

photon-counting  devices,  100 

scattering  techniques,  97 

X-ray  beams,  100 
Synchrotron  radiation  circular  dichroism 

(SRCD),  50 
Synthetic  biology,  372,  378 
Szent-Gyorgyi,  A.G.,  342,  350 


T 

Tail,  343,  353 

Tanaka,  K.,  218 

Taylor,  E.W.,  346-348,  354 

Tension,  350,  351 

Thakkar,  S.V.,  58 

Therapeutic  protein  development. 

See  Protein  therapeutics 
Therapeutics,  14 
Thermal  neutrons,  107 
Thermodynamic  coupling,  349 
Thermodynamics 
coupling,  349 

and  kinetic  principles,  24-25 

Thick  filament,  343,  347 

Thin  filament,  343 

Time-of-flight  (TOF)  MS 

ESI  mass  spectrum,  tRNATHR,  229 
and  hybrid  quadrupole,  224-225 




Time-resolved  fluorescence  anisotropy 

decay,  344 
Time  resolved  X-ray  crystallography,  98 
TIRF.  See  Total  internal  reflection 

fluorescence  microscopy 

Titin,  263 

TMV.  See  Tobacco  mosaic  virus  (TMV) 
Tobacco  mosaic  virus  (TMV),  102,  103 
TOF  MS.  See  Time-of-flight  (TOF)  MS 
Total  internal  reflectance  microscopy 

(TIRFM),  10 
Total  internal  reflection  fluorescence 
microscopy  (TIRF),  352 
evanescent  wave,  270-271 
immobilized  molecules,  273 
Totally  correlated  spectroscopy 
(TOCSY)  experiment 
advantages,  126 

and  NOESY  experiment,  134,  135 
Track,  myosin  molecules,  352-354 
Transducer,  349 
Transferrin  (Tf),  241 
Transient  EPR,  190-192 
Transient  kinetics,  346 
Translocation,  352,  354 
Transmembrane,  315,  320,  326 
Transverse  relaxation  optimized  spectroscopy 

(TROSY)  selection,  124,  125, 

144,  159,  161 
T1  relaxation.  See  Spin-lattice  relaxation 
T2  relaxation.  See  Spin-spin  relaxation 
Trentham,  D.R.,  346,  348 
Tropomyosin,  353 
Troponin,  353 
Tryptophan  (Trp),  346 
Tubulin,  342 

U 

Umbrella  method,  350 
Unconventional  myosins,  352,  353 
Unwinding  assay 

ensemble  (see  Ensemble  unwinding  assays) 

gel-based,  296-297 

kinetic  trace  (see  Kinetics) 

real  time,  296 
Upper  50  kDa  subdomain  (U50),  345 
UV  absorbance,  42 


V 

Vacuolar  ATPase  (V-ATPase) 

endomembrane  system  of  all  eukaryotic 
cells,  314 


proton  pump,  317 

single  molecule  rotation  experiments,  329 

V-ATPase.  See  Vacuolar  ATPase  (V-ATPase) 

V1-ATPase,  324,  325 

VCD.  See  Vibrational  circular  dichroism 
(VCD) 

Vibrational  circular  dichroism  (VCD),  50 
Virus 

and  bacteria,  266 

icosahedral,  281 
Visscher,  K.,  269 
V0-proton  channel,  322 
V1V0  H+-ATPase,  109 


W 

Wada,  A.,  27 
Walker,  J.E.,  326 
Wallace,  B.A.,  44 

WATERGATE  solvent  suppression,  159 

Watson,  J.D.,  4,  5,  19,  99 

Weber,  A.,  342 

Wen,  Z.,  75 

Wilkins,  M.,  4,  5 

Wong,  S.S.,  243 

Wyatt,  P.J.,  82 

X 

X-ray  crystallography 
atomic  resolution,  318 
catalytic  and  proteolipid  domains, 
318,  319 

mitochondria-rich  animal  tissues,  318 
transmembrane  segments,  320 
X-ray  diffraction 
fiber,  99-103 

noncrystalline  materials,  99 
X-ray  scattering 

neutron  (see  Neutron  scattering) 
theory,  91-92 

Y 

Yu,  S.,  67 
Z 

Zeeman  splitting 

Boltzmann  population  difference,  130 
molecular  triplet  states,  206-207 
nondegenerate  spin  states,  114,  115 
non-Kramer's  systems,  183 

Zscherp,  C.,  66 


